Online distributed tracking of generalized Nash equilibrium on physical networks

In generalized Nash equilibrium (GNE) seeking problems over physical networks such as power grids, enforcing network constraints in a time-varying environment may incur high computational costs. Online algorithms are a promising way to cope with this challenge: instead of computing system states, they directly use values measured from the physical network. In this paper, we propose an online distributed algorithm via measurement feedback to track the GNE in a time-varying networked resource sharing market. Because some system states are not measurable and measurement noise always exists, a dynamic state estimator based on a Kalman filter is incorporated, which yields closed-loop dynamics for the measurement-feedback-driven online algorithm. We prove that, with a fixed step size, this online algorithm converges to a neighborhood of the GNE in expectation. Numerical simulations validate the theoretical results.


Background
Generalized Nash game (GNG) problems have received increasing attention in recent years. For non-cooperative game models, the key problem is how to seek the generalized Nash equilibrium (GNE), especially in a distributed manner. Distributed GNE seeking algorithms have been applied in a growing number of fields, such as power systems [1][2][3], communication networks [4,5], multi-cloud systems [6,7] and autonomous driving [8]. By exchanging partial information with direct neighbors through a communication network, each player chooses his own strategy individually and reaches the GNE after a certain number of iterations. During the iterations, the private information of players is largely protected.
*Correspondence: lfeng@tsinghua.edu.cn. 1 The State Key Laboratory of Power System, Department of Electrical Engineering, Tsinghua University, Beijing, China. Full list of author information is available at the end of the article.
Consider GNGs on physical networks, say, a networked power market. Obtaining the system states, which obey Kirchhoff's laws and depend on the operating conditions, can be computationally expensive. The problem becomes much more challenging when the parameters of the physical network are time-varying. In this context, traditional distributed algorithms are inadequate, and online algorithms via measurement feedback provide a promising alternative. Instead of being numerically computed, system states are directly measured from the physical system and fed back to drive the online algorithm, which yields a closed-loop algorithm. In this way, the algorithm can remarkably relieve the computational burden and respond much faster to the time-varying environment. It further enables online tracking of GNEs in time-varying environments.
To enable measurement-feedback based online algorithms for GNE tracking, two issues need to be considered. On the one hand, some system states may not be measurable. On the other hand, the measurements always suffer from noise, which may undermine the convergence of an online algorithm. In this regard, state estimation (SE) is often used to refine raw measurements in practice, which works well to mitigate the negative effects of Gaussian noise. In this paper, we develop a measurement-based online distributed GNE tracking algorithm with a Kalman filter based dynamic SE.

Related works

Distributed GNE seeking
In reference [9], a Nesterov-based algorithm was proposed to seek the GNE of an energy sharing game among prosumers, where the proposed algorithm showed better convergence performance than two classical distributed methods: the alternating direction method of multipliers (ADMM) and the gradient descent method. References [10,11] each studied a distributed forward-backward algorithm for GNE seeking. The former studied a stochastic GNG, while the latter formulated an asynchronous algorithm paradigm. Reference [12] proved the convergence of a fully distributed GNE seeking method based on continuous-time consensus and primal-dual gradient dynamics in a partial-information scenario. To handle a general convex set without an analytic expression, reference [13] proposed a consensus and gradient projection based method to seek an ε-GNE. In reference [14], a multi-cluster game with nonsmooth payoff functions was solved by a projected differential inclusion algorithm, whose correctness and convergence were proved by Lyapunov stability theory.
Distributed GNE seeking methods mentioned above are all offline algorithms. Although reference [15] has attempted to formulate an online algorithm to cope with the time-varying future cost function, the measurement feedback has not been well studied.

Online algorithm via measurement feedback
In reference [16], an online gradient projection algorithm was proposed that treats the measurement as an implicit power flow solution. A varying penalty coefficient was used to guarantee the satisfaction of power flow equations and operational constraints during iteration. Reference [17] formulated a novel online approximate optimization problem based on the quasi-Newton L-BFGS-B method using measured zeroth- and first-order information. Reference [18] was a distributed version of [17] based on the distributed interior-point method. In reference [19], an online implementation method was introduced to adapt to the time-varying environment in an asynchronous paradigm. To cope with incompletely measurable system states, reference [20] used a weighted least squares state estimator as feedback and formulated a closed-loop distributed primal-dual gradient algorithm for a single-period optimization problem. For a multi-period problem with random process noise, reference [21] utilized a dynamic SE based on the Kalman filter in a centralized gradient projection method, and the resulting algorithm was theoretically proved to converge to the offline optimal solution in expectation.
The works mentioned above all consider online algorithms in global cost optimization problems, whereas the non-cooperative game model with the online algorithm via measurement feedback has not been addressed. This paper aims to partially fill this gap.

Contributions
The major contributions of this paper are two-fold:
1. Model and Algorithm. We formulate a resource sharing market on the physical network as a GNG. To seek the GNE, an online distributed tracking algorithm of the GNE (ODT-GNE) via measurement feedback is proposed based on primal-dual gradients. Measurements from the physical network are utilized as the feedback of the online gradient method, which forms a closed-loop algorithm. To cope with the challenge caused by immeasurable system states and measurement noise, a dynamic SE based on the Kalman filter is deployed. To the best of our knowledge, this is the first time that a closed-loop online algorithm via measurement feedback with dynamic SE has been investigated in GNE seeking problems.
2. Convergence Analysis. We prove that, with a fixed step size, the ODT-GNE algorithm converges to a neighborhood of the GNE. It is non-trivial to analyze the convergence of the online closed-loop algorithm due to the complex coupling of the gradient-based method, the market clearing, the measurement feedback, the SE, and the physical system equations. Hence we instead establish an offline algorithm as a baseline. We characterize the gap between the GNE and the fixed point of the offline algorithm, and then prove the convergence of the online algorithm to the offline algorithm.

Organization
The rest of this paper is organized as follows. Section 2 formulates the GNG model as a resource sharing market. In Section 3, the online distributed GNE seeking algorithm via measurement feedback is proposed based on its offline version. Section 4 proves the convergence of the proposed online distributed GNE seeking algorithm. Section 5 gives numerical results and Section 6 concludes this paper.
Notations: In this paper, we use $\mathbb{R}^n$ ($\mathbb{R}^n_+$) to denote the n-dimensional (nonnegative) Euclidean space. For a column vector $x \in \mathbb{R}^n$ (matrix $A \in \mathbb{R}^{m \times n}$), $x^T$ ($A^T$) denotes its transpose. For $x, y \in \mathbb{R}^n$, we denote the inner product by $\langle x, y \rangle = x^T y$ and the 2-norm by $\|x\| = \sqrt{\langle x, x \rangle}$. For a vector $x \in \mathbb{R}^n$, $x_i$ stands for the ith entry. $\operatorname{col}\{x_i\}_{i \in I}$ stacks the vectors $x_i$ as a new column vector in the order of the index set $I$. For a matrix $B \in \mathbb{R}^{m \times n}$, $\|B\|$, $\lambda_{\min}(B)$, $\operatorname{tr}(B)$, and $B_i$ stand for its 2-norm, minimal eigenvalue, trace, and the ith column vector, respectively. Denote by $0_n, 1_n \in \mathbb{R}^n$, $0_{m,n}, 1_{m,n} \in \mathbb{R}^{m \times n}$, and $I_n \in \mathbb{R}^{n \times n}$ the vectors of all zeros and ones, the matrices of all zeros and ones, and the identity matrix. For a closed convex set $\Omega \subset \mathbb{R}^n$, we define the projection of $x \in \mathbb{R}^n$ onto $\Omega$ as $[x]_\Omega := \arg\min_{y \in \Omega} \|y - x\|$. In particular, denote by $[x]_+$ the projection onto $\mathbb{R}^n_+$. This projection operator is nonexpansive, i.e., $\|[x]_\Omega - [y]_\Omega\| \le \|x - y\|$, $\forall x, y \in \mathbb{R}^n$. For a set $\Omega \subset \mathbb{R}^n$ and a vector $x \in \Omega$, the normal cone is defined as $N_\Omega(x) := \{y \in \mathbb{R}^n \mid \langle y, z - x \rangle \le 0, \ \forall z \in \Omega\}$.
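As a small numerical illustration of the projection notation, the sketch below checks nonexpansiveness for two simple closed convex sets. The function names `project_nonneg` and `project_box`, the box bounds, and the random test points are illustrative choices, not objects from the paper.

```python
import numpy as np

def project_nonneg(x):
    """Projection onto the nonnegative orthant, written [x]_+ in the text."""
    return np.maximum(x, 0.0)

def project_box(x, lo, hi):
    """Projection onto the box {y : lo <= y <= hi}, one example of a closed convex set."""
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)

# Nonexpansiveness: projecting both points never increases their distance.
dist_before = np.linalg.norm(x - y)
dist_after_orthant = np.linalg.norm(project_nonneg(x) - project_nonneg(y))
dist_after_box = np.linalg.norm(project_box(x, -0.5, 0.5) - project_box(y, -0.5, 0.5))
print(dist_before, dist_after_orthant, dist_after_box)
```

Both projected distances should be no larger than the original one, which is the property used repeatedly in the convergence proofs.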

Resource sharing market on physical networks
We focus on the resource sharing market with the physical network constraint, which consists of three levels: the market, prosumer, and network levels. The structure is presented in Fig. 1. Prosumers make transactions through the market and then share resources through the physical network.
The set of prosumers is denoted by $\mathcal{N} = \{1, \ldots, n\}$. Denote by $x_i$ the volume of trade of prosumer i ($x_i > 0$ means he is a producer and sells resources to the market, $x_i < 0$ means he is a consumer and buys resources from the market). The market clearing price $p(x)$ is defined as
$$p(x) = p_0 - a \sum_{i \in \mathcal{N}} x_i, \tag{1}$$
where $a > 0$ is the price elasticity coefficient, $x = \operatorname{col}\{x_i\}_{i \in \mathcal{N}} \in \mathbb{R}^n$ is the strategy vector, and $p_0$ is the nominal price. At time t, prosumer i decides his transaction by solving the following individual subproblem
$$\min_{x_i} \ f_i(x_i, x_{-i}) = c_i x_i^2 + d_i x_i + e_i - p(x)\, x_i \tag{2a}$$
$$\text{s.t.} \quad x_i \in \mathcal{X}_i, \tag{2b}$$
$$\qquad\ B x \le b_t, \tag{2c}$$
where the subscript $-i$ means all prosumers in $\mathcal{N}$ except i, $c_i x_i^2 + d_i x_i + e_i$ is the generation cost for a producer (the disutility for a consumer) with quadratic coefficient $c_i > 0$, (2c) is the network constraint at time t, the constant matrix $B \in \mathbb{R}^{m \times n}$ depends on the network topology and parameters, and $b_t \in \mathbb{R}^m$ is a time-varying vector.
Fig. 1 The structure of the resource sharing market on a physical network
The network constraint can be reformed as where z ∈ R p is the system state vector, B ∈ R p×n and B ∈ R m×p are constant matrices.
At time t, the problem (2) is a generalized Nash game, which consists of the following elements: 1) the set of players N ; 2) the strategy x i in the strategy set We make the following assumption on the problem.
Assumption A1:
1. The market clearing price is always positive, i.e., $p_0 - a \sum_{i} x_i > 0$.
2. The problem is always feasible ($\mathcal{X}_t^{+} \neq \emptyset$, $\forall t$), i.e., there exists an $x_t \in \mathcal{X}$ such that $B x_t \le b_t$ holds.
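To make the market model concrete, the following sketch evaluates the clearing price and each prosumer's partial gradient for a hypothetical three-prosumer market. It assumes the linear price rule $p(x) = p_0 - a\sum_i x_i$ and a cost of the form "quadratic cost minus trading revenue $p(x)\,x_i$"; the revenue term and all numerical values are assumptions made for illustration.

```python
import numpy as np

def clearing_price(x, p0, a):
    """Assumed linear price rule: p(x) = p0 - a * sum_i x_i."""
    return p0 - a * np.sum(x)

def prosumer_cost(i, x, c, d, e, p0, a):
    """Assumed cost of prosumer i: quadratic cost minus revenue p(x) * x_i."""
    return c[i] * x[i] ** 2 + d[i] * x[i] + e[i] - clearing_price(x, p0, a) * x[i]

def pseudo_gradient(x, c, d, p0, a):
    """Stack of d f_i / d x_i; each prosumer differentiates only in its own x_i."""
    return 2 * c * x + d - clearing_price(x, p0, a) + a * x

# Hypothetical data: two sellers (x_i > 0) and one buyer (x_i < 0).
c = np.array([1.0, 1.5, 2.0])
d = np.array([0.5, 0.2, 0.1])
e = np.array([0.0, 0.1, 0.2])
x = np.array([0.5, -0.2, 0.1])
p0, a = 10.0, 1.0

price = clearing_price(x, p0, a)
grad = pseudo_gradient(x, c, d, p0, a)
print("price:", price, "pseudo-gradient:", grad)
```

Under these assumptions the Jacobian of the pseudo-gradient is $\operatorname{diag}(2c_i + a) + a\,1_n 1_n^T$, whose smallest eigenvalue is at least $2c_m + a$; this matches the constant that appears in Lemma 1.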

Existence and uniqueness of GNE
In this subsection, we define a special GNE of the game (2) and prove the existence and uniqueness of the GNE.

Definition 1 (Generalized Nash Equilibrium) A strategy
where $\lambda^{(i)} \in \mathbb{R}^m_+$ is the Lagrangian multiplier (shadow price) vector of (2c). From Definition 1, the GNE of the problem (2) satisfies the following KKT condition [22, Thm. 3.25].
The GNG commonly has a low-dimensional manifold of GNEs, instead of isolated points. In this paper, we focus on a specific GNE, where all prosumers have the identical Lagrangian multiplier vector, i.e., $\lambda = \lambda^{(1)} = \cdots = \lambda^{(n)}$. The practical significance is that the same shadow price associated with the network constraints is imposed on all prosumers.

Definition 2 (Generalized Nash Equilibrium) At time t, $x_t^*$ is a specific GNE of (2) with the identical Lagrangian multiplier vector $\lambda_t^*$, which satisfies
Hereafter, the GNE of (2) means the one satisfying Definition 2.
Then we prove the existence and uniqueness of the GNE. We have the following proposition.
Proof Under Assumption A1, the problem (6) is feasible and the quadratic coefficient matrix A > 0, meaning the existence and uniqueness of the optimal solution. Then the optimal solution must satisfy the KKT condition where μ ∈ R m is the Lagrangian multiplier vector of (6c). Note that (7) is exactly the compact form of (5). Hence the GNE of (2) is equivalent to the optimal solution of (6), which completes the proof.
At the GNE, denote by

Distributed GNE seeking algorithm
In this section, an offline distributed GNE seeking algorithm and its online tracking version are proposed.

Offline distributed GNE seeking
In this subsection, we propose an offline distributed GNE seeking algorithm based on the primal-dual gradient method. First, we add a regularization term $-\phi \|\lambda\|^2 / 2$ to the Lagrangian, where $\phi > 0$ is a constant parameter. This modification improves the convergence performance of first-order gradient algorithms. The error it introduces will be discussed later.
The regularized Lagrangian is defined as Then we have the following offline distributed GNE seeking algorithm as where $\alpha > 0$ is the constant step size. For brevity, we eliminate the price term p and obtain the compact form as Define the stacked vector $u := [x^T, \lambda^T]^T$ and the operator Then the offline algorithm can be described by where the operator $F_t : \mathbb{R}^{n+m} \to \mathbb{R}^{n+m}$ is defined as
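The offline iteration can be sketched numerically as a projected primal-dual step on the stacked vector $u = [x^T, \lambda^T]^T$. The sketch below is an illustration under stated assumptions, not the paper's exact formulas: it assumes the cost $f_i = c_i x_i^2 + d_i x_i + e_i - p(x) x_i$, box strategy sets, an operator $G_t$ that stacks the pseudo-gradient plus $B^T\lambda$ with the negated constraint residual plus $\phi\lambda$, and $F_t(u)$ equal to the projection of $u - \alpha G_t(u)$. All numbers are hypothetical.

```python
import numpy as np

# Hypothetical data: n = 3 prosumers, m = 1 network constraint B x <= b.
n, m = 3, 1
c = np.array([1.0, 1.5, 2.0])
d = np.array([0.5, 0.2, 0.1])
p0, a, phi = 5.0, 0.5, 1.0
B = np.array([[1.0, 1.0, 1.0]])
b = np.array([0.5])
lo, hi = -5.0, 5.0                      # assumed box strategy sets X_i = [lo, hi]

# Assumed operator G(u) = A u + q: pseudo-gradient plus B^T lambda in the x block,
# negated constraint residual plus phi * lambda in the lambda block.
J = np.diag(2 * c + a) + a * np.ones((n, n))
A = np.block([[J, B.T], [-B, phi * np.eye(m)]])
q = np.concatenate([d - p0, b])

def G(u):
    return A @ u + q

def F(u, alpha):
    """One projected primal-dual step: x onto the box, lambda onto R^m_+."""
    v = u - alpha * G(u)
    return np.concatenate([np.clip(v[:n], lo, hi), np.maximum(v[n:], 0.0)])

M = min(2 * c.min() + a, phi)           # strong monotonicity constant from Lemma 1
L = np.linalg.norm(A, 2)                # Lipschitz constant of the affine map G
alpha = M / L**2                        # lies inside 0 < alpha < 2M/L^2
beta = np.sqrt(1 - 2 * alpha * M + alpha**2 * L**2)

u = np.zeros(n + m)
for k in range(5000):
    u = F(u, alpha)
residual = np.linalg.norm(F(u, alpha) - u)
print("beta =", beta, "fixed point =", u, "residual =", residual)
```

With this step size the contraction factor is $\beta = \sqrt{1 - M^2/L^2} < 1$, so the iterates approach the unique fixed point geometrically. A larger $\phi$ speeds up convergence but moves the fixed point further from the unregularized GNE, which is the trade-off quantified in the error analysis.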

Online distributed GNE tracking
In order to alleviate the computational burden, the online distributed GNE seeking algorithm relies on measured system states instead of computed ones. Measurement raises its own issues: for instance, some system states may not be measurable and large measurement errors may exist. Hence state estimation is utilized to refine the raw measured data. By subtracting the time-varying network Eq. (3a) at consecutive time slots, we build the system dynamic equation as where $\omega_{z,t} = g_t - g_{t-1}$ is the random process noise representing the time-varying condition of the system. Define the measured value vector $y \in \mathbb{R}^q$. Consider the following linearized measurement model for the SE
$$y_t = H z_t + \omega_{y,t},$$
where $H \in \mathbb{R}^{q \times p}$ is the constant observation matrix and $\omega_{y,t} \in \mathbb{R}^q$ is the random measurement noise. A dynamic SE with the Kalman filter is presented as where $\hat{z}_t \in \mathbb{R}^p$ is the estimated system state vector, $P_t \in \mathbb{R}^{p \times p}$ is the estimate covariance matrix, and $K_t \in \mathbb{R}^{p \times q}$ is the Kalman gain matrix. We make the following assumption on the dynamic system with SE.
Assumption A2:
1. For all t, the process noise $\omega_{z,t}$ is Gaussian with the known probability distribution $\omega_{z,t} \sim \mathcal{N}(0, \Sigma_z)$.
2. The measurement noise $\omega_{y,t}$ is Gaussian with the known probability distribution $\omega_{y,t} \sim \mathcal{N}(0, \Sigma_y)$.
3. The observation matrix $H$ has full column rank.
4. The estimate covariance matrix $P_t$ is lower and upper bounded, i.e., there exist constant parameters $\rho_m, \rho_M > 0$ such that $\rho_m I \preceq P_t \preceq \rho_M I$, $\forall t$.
5. The Kalman gain matrix $K_t$ is upper bounded by $\|K_t\| \le \sigma$, $\forall t$. Therefore, $I - K_t H$ is also bounded by $\|I - K_t H\| \le 1 + \sigma \|H\|$, $\forall t$.
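A textbook Kalman filter for this setting can be sketched as follows. The sketch assumes the differenced state equation has the form $z_t = z_{t-1} + \hat{B}(x_t - x_{t-1}) + \omega_{z,t}$ with a hypothetical sensitivity matrix `B_hat`, together with the measurement model $y_t = H z_t + \omega_{y,t}$; the ordering and indexing of the paper's own filter equations may differ, and all numerical values are illustrative.

```python
import numpy as np

def kalman_step(z_hat, P, y, dx, B_hat, H, Sigma_z, Sigma_y):
    """One predict/update cycle of a standard Kalman filter (assumed model, see lead-in)."""
    z_pred = z_hat + B_hat @ dx                    # predict the state from the known trade change
    P_pred = P + Sigma_z                           # predict the estimate covariance
    S = H @ P_pred @ H.T + Sigma_y                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)            # Kalman gain
    z_new = z_pred + K @ (y - H @ z_pred)          # correct with the measurement residual
    P_new = (np.eye(len(z_hat)) - K @ H) @ P_pred  # updated estimate covariance
    return z_new, P_new, K

rng = np.random.default_rng(0)
p, q, n = 3, 4, 2                                  # states, measurements, prosumers
B_hat = rng.normal(size=(p, n))                    # hypothetical sensitivity matrix
H = np.vstack([np.eye(p), np.ones((1, p))])        # full column rank, as in Assumption A2
Sigma_z = 1e-4 * np.eye(p)
Sigma_y = 1e-4 * np.eye(q)

z_true, z_hat, P = np.ones(p), np.zeros(p), np.eye(p)
x_prev = np.zeros(n)
for t in range(50):
    x = 0.1 * np.sin(0.2 * t) * np.ones(n)         # stand-in for the prosumers' decisions
    dx = x - x_prev
    z_true = z_true + B_hat @ dx + rng.multivariate_normal(np.zeros(p), Sigma_z)
    y = H @ z_true + rng.multivariate_normal(np.zeros(q), Sigma_y)
    z_hat, P, K = kalman_step(z_hat, P, y, dx, B_hat, H, Sigma_z, Sigma_y)
    x_prev = x

err = np.linalg.norm(z_hat - z_true)
print("estimation error:", err, "trace(P):", np.trace(P))
```

Even though the estimate starts far from the true state, the error shrinks to the noise level within a few steps, and both `P` and `K` stay bounded, which is the behavior that items 4 and 5 of Assumption A2 postulate.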
The estimated system states $\hat{z}_t$ are utilized in the update of $\lambda$ as Then we define the online algorithm

Algorithm 1 ODT-GNE Algorithm
Initialization: Step size $\alpha > 0$, initial trade volume $x_0 \in \mathcal{X}$, initial Lagrangian multiplier $\lambda_0 \ge 0$, initial estimate covariance matrix $P_0$, initial Kalman gain matrix $K_0$.
For each time t = 1, 2, ...
S1: The market operator measures the value $y_t$ and estimates the system states $\hat{z}_t$ by (14).
S2: The market operator receives prosumers' trade volumes and sets the clearing price $p_t$ by (1). He also updates the network constraint price $\lambda_t$ by (15).
S3: The market operator sends the network constraint price $B_i^T \lambda_t$ and the clearing price $p_t$ to each prosumer i.
S4: Prosumer i receives the prices, updates his trade volume $x_{i,t}$ by (8a), and then carries out the trade in the physical system.
S5: Wait until the next time slot and go to S1.
The concrete algorithm is presented in Algorithm 1 and the algorithm framework is shown in Fig. 2. Denote by $p_t = p_0 - a \sum_i x_{i,t}$ the market clearing price and by $F_t = \sum_i f_i(x_{i,t}, x_{-i,t})$ the total cost at time slot t.
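The closed loop of steps S1 to S5 can be mimicked on a toy system. Everything specific in the sketch below is an assumption made for illustration: the state model $z_t = \hat{B} x_t + g_t$ with a random-walk offset $g_t$, upper limits `z_max` as the only network constraint, the dual update $\lambda \leftarrow [\lambda + \alpha(\hat{z}_t - z_{\max} - \phi\lambda)]_+$, the prosumer cost with a revenue term $p(x)\,x_i$, and all numerical values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3                                     # prosumers; one system state per prosumer
c = np.array([1.0, 1.5, 2.0])
d = np.array([0.5, 0.2, 0.1])
p0, a, phi, alpha = 5.0, 0.5, 0.1, 0.02
lo, hi = -2.0, 2.0                        # assumed box strategy sets
B_hat = 0.1 * np.eye(n) + 0.05            # assumed sensitivity of states to trades
z_max = 1.05 * np.ones(n)                 # assumed upper limits on the states
H = np.eye(n)                             # all states measured (full column rank)
Sigma_z, Sigma_y = 1e-6 * np.eye(n), 1e-4 * np.eye(n)

x, x_prev, lam = np.zeros(n), np.zeros(n), np.zeros(n)
g = np.ones(n)                            # time-varying offset of the physical network
z_hat, P = np.ones(n), np.eye(n)

for t in range(300):
    # Physical system: the environment drifts and reacts to the executed trades.
    g = g + rng.multivariate_normal(np.zeros(n), Sigma_z)
    z = B_hat @ x + g

    # S1: measure and estimate the states with a Kalman filter.
    y = H @ z + rng.multivariate_normal(np.zeros(n), Sigma_y)
    z_pred = z_hat + B_hat @ (x - x_prev)
    P_pred = P + Sigma_z
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + Sigma_y)
    z_hat = z_pred + K @ (y - H @ z_pred)
    P = (np.eye(n) - K @ H) @ P_pred

    # S2: clear the market and update the constraint price from the ESTIMATED states.
    price = p0 - a * np.sum(x)
    lam = np.maximum(lam + alpha * (z_hat - z_max - phi * lam), 0.0)

    # S3/S4: each prosumer takes one projected gradient step on its own cost.
    grad = 2 * c * x + d - price + a * x + B_hat.T @ lam
    x_prev, x = x, np.clip(x - alpha * grad, lo, hi)

est_err = np.linalg.norm(z_hat - z)
print("trades:", x, "constraint prices:", lam, "estimation error:", est_err)
```

The point of the sketch is structural: no power-flow-like equation is solved inside the loop, because the state information enters only through the measurement `y` and its filtered estimate `z_hat`.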

Convergence analysis
It takes three steps to transform the original GNG (2) into the ODT-GNE algorithm:
• Unifying the Lagrangian multiplier. The set of GNEs can be a low-dimensional manifold consisting of infinitely many non-isolated equilibria, and we aim to identify a unique point (the GNE defined in Definition 2) on the manifold, where all prosumers share an identical Lagrangian multiplier λ.
• Regularization term. A quadratic regularization term of λ is added to the Lagrangian to formulate the offline GNE seeking algorithm, which introduces a certain error.
• Online reformulation. The ODT-GNE algorithm is built on the offline version, where the process and measurement noises introduce errors.
In this section, we bound the errors of the second and third steps, respectively, and then present the main result of GNE tracking performance.

Error of regularization
In this subsection, we prove that F t is a contractive mapping operator, and then show that the offline algorithm converges to a neighborhood of the GNE. We begin with two lemmas on G t . Define c m = min i c i and c M = max i c i .

Lemma 1 Suppose Assumption A1 holds. $G_t$ is M-strongly monotone with $M = \min\{2c_m + a, \phi\}$.
Proof. For all $u, v \in \mathbb{R}^{n+m}$, we have where the second term of (16) equals 0. The proof is completed directly from the definition of a strongly monotone operator.

Lemma 2 Suppose Assumption A1 holds. G t is L-Lipschitz continuous with L
which completes the proof.
Then we give the Lipschitz continuity of F t , which implies it is a contractive mapping.

Lemma 3 Suppose Assumption A1 holds. If the step size satisfies
then F t is β-Lipschitz continuous (contractive mapping) with Proof . For ∀u, v ∈ R n+m , we have where the first inequality follows from the nonexpansiveness of projection and the second one holds from Lemmas 1 and 2.
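For orientation, the standard argument behind such a contraction result is sketched below. It assumes that $F_t(u)$ is the projection of $u - \alpha G_t(u)$ onto the feasible set, and the resulting expression for $\beta$ is a reconstruction from that standard argument rather than a quotation of the lemma.

```latex
\begin{aligned}
\|F_t(u) - F_t(v)\|^2
  &\le \|(u - v) - \alpha\,(G_t(u) - G_t(v))\|^2 \\
  &= \|u - v\|^2 - 2\alpha\,\langle G_t(u) - G_t(v),\, u - v\rangle
     + \alpha^2\,\|G_t(u) - G_t(v)\|^2 \\
  &\le \left(1 - 2\alpha M + \alpha^2 L^2\right)\|u - v\|^2 .
\end{aligned}
```

Under this reading, $\beta = \sqrt{1 - 2\alpha M + \alpha^2 L^2}$, and $\beta < 1$ holds exactly when $0 < \alpha < 2M/L^2$, which is the step-size range used in Proposition 2.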
Then we have the following proposition immediately by the Banach fixed-point theorem.
Proposition 2 Suppose Assumption A1 holds. If the step size satisfies $0 < \alpha < 2M/L^2$, there exists a unique fixed point of $F_t$, denoted by $\tilde{u}_t^* = [\tilde{x}_t^{*T}, \tilde{\lambda}_t^{*T}]^T$, i.e., $F_t(\tilde{u}_t^*) = \tilde{u}_t^*$. Moreover, starting from an arbitrary point $u_0 \in \mathbb{R}^{n+m}$, the sequence $\{u_k\}$ defined by $u_{k+1} = F_t(u_k)$ satisfies $u_k \to \tilde{u}_t^*$.
Then we can present the gap between the fixed point of the offline algorithm and the GNE. Define Theorem 1 Suppose Assumption A1 holds. If the step size satisfies 0 < α < 2M/L 2 , then the gap between the fixed point of the offline algorithm (11) and the GNE of (2) is presented as The KKT condition (7) shows that u * t is the unique fixed point of F t . Then we have where the second inequality holds from the triangle inequality and the third one follows from Lemma 3, the non-expansiveness of projection, and definitions of F t and F t , repeating the third inequality h − 1 times builds the fourth one, and letting h → ∞ completes the proof.

Remark 1 The error caused by the regularization term is proportional to the coefficient φ and the step size α. On the one hand, with appropriate values of φ and α, the accuracy of the algorithm can be well preserved. On the other hand, adding the regularization term can improve the convergence of dual-based gradient algorithms [23].

Error of online measurement
In this subsection, we prove that the ODT-GNE algorithm converges to the fixed point of the offline algorithm. We start with two lemmas about the time-varying system.

Lemma 4 Suppose Assumption A1 holds. For ∀t, the operator F t satisfies
Proof . From the definition of F t and G t , we have which completes the proof.

Lemma 5
Suppose Assumption A1 holds. If the step size satisfies $0 < \alpha < 2M/L^2$, then we have Proof. From the optimality of $u_t^*$, we have where the first inequality follows from the triangle inequality, the second one follows from Lemmas 3 and 4, repeating the second inequality h − 1 times yields the third one, and letting h → ∞ completes the proof.
Then we characterize the gap between the online and offline operators by $\|\hat{z}_t - z_t\|$, the error of system state estimation.

Lemma 6
Suppose Assumption A1 holds. For all $u \in \mathbb{R}^{n+m}$, we have Proof. The result directly follows from (3) and the definitions of the online and offline operators.

Lemma 7 Suppose Assumption A2 holds. We have
where the parameters are defined as Proof . The result directly follows from [21,24].
Then we prove that the online algorithm converges to a neighborhood of the offline fixed point in expectation.

Theorem 2 Suppose Assumptions A1 and A2 hold. If the step size satisfies
where the parameters are defined as where (a) follows from the triangle inequality, (b) follows from Lemma 5 and the definitions of $u_{t+1}$ and the offline fixed point at time t, (c) holds from the triangle inequality and $\mathbb{E}\|\omega_{z,t+1}\| \le \sqrt{\mathbb{E}\|\omega_{z,t+1}\|^2} = \sqrt{\operatorname{tr}(\Sigma_z)}$, (d) follows from Lemmas 3 and 6, (e) follows from $\mathbb{E}\|X\| \le \sqrt{\mathbb{E}\|X\|^2}$, (f) holds from Lemma 7, (g) is derived by repeating (f) t times, and (h) follows from the definition of $\xi$ as a maximum involving $\beta$.
Remark 2 Theorem 2 justifies the convergence of the ODT-GNE algorithm. The error of the online algorithm depends on three parts: 1) the initial deviation of the primal and dual variables $\mathbb{E}\|u_0 - \tilde{u}_0^*\|$, where $\tilde{u}_0^*$ is the offline fixed point at time 0; 2) the initial error of system state estimation $\mathbb{E}\|\hat{z}_0 - z_0\|^2$; 3) the covariance matrices of the process and measurement noises $\Sigma_z$, $\Sigma_y$. The coefficients of the first and second terms decrease to 0, while the third one remains. If the system is noiseless, i.e., $\operatorname{tr}(\Sigma_y) = \operatorname{tr}(\Sigma_z) = 0$, the remaining term $C_2$ vanishes, and hence the ODT-GNE algorithm converges exactly to the trajectory of the offline fixed point, which indicates the generality of our results.
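The step "repeating (f) t times" relies on a generic fact about linear error recursions, sketched below for a scalar error $e_t \ge 0$, a contraction factor $0 \le \xi < 1$, and a constant disturbance level $c \ge 0$. These symbols are placeholders for illustration; the actual bound in Theorem 2 couples the iterate error with the estimation error and has its own constants.

```latex
e_{t+1} \le \xi\, e_t + c
\quad\Longrightarrow\quad
e_t \;\le\; \xi^{t} e_0 + c \sum_{k=0}^{t-1} \xi^{k}
\;=\; \xi^{t} e_0 + \frac{1 - \xi^{t}}{1 - \xi}\, c
\;\le\; \xi^{t} e_0 + \frac{c}{1 - \xi}.
```

The first term is the transient that decays geometrically with the initial deviation, and the second is the residual neighborhood that remains as long as noise is present, mirroring the structure described in Remark 2.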

Main result
The combination of Theorems 1 and 2 formulates the following theorem that characterizes the tracking performance of the ODT-GNE algorithm.

Theorem 3 Suppose Assumptions A1 and A2 hold. If the step size satisfies
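Proof. The result directly follows from the triangle inequality $\|u_t - u_t^*\| \le \|u_t - \tilde{u}_t^*\| + \|\tilde{u}_t^* - u_t^*\|$, where $\tilde{u}_t^*$ denotes the fixed point of the offline algorithm and $u_t^*$ the GNE, and from Theorems 1 and 2.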

Illustrative example
In this section, we validate the online distributed algorithm in the real-time energy sharing market of the distribution network considering the bus voltage control.

Energy sharing in distribution networks
Prosumers share energy through the distribution system, which may cause the violations of bus voltages. We first formulate the physical model and then show that it follows our primal GNG model (2).
Prosumers decide their traded energy $x_i$ in the strategy set $\mathcal{X}_i$. $x_i > 0$ means that the prosumer sells energy and the power is injected into the distribution network; otherwise, the prosumer buys energy and the power flows out of the distribution network. The relation between bus voltages and prosumers' decisions can be described by the LinDistFlow model [16,19] as for $j \in N_i$, where $P^d_{j,t}$, $Q^d_{j,t}$, $v_j$ and $x_j$ are the inelastic active and reactive loads, the magnitude of bus voltage, and the traded energy of the prosumer at bus j, $z_i = v_i^2/2$ is half the squared voltage magnitude at bus i, $r_{ij}$, $x_{ij}$, $P_{ij}$, $Q_{ij}$, and $l_{ij}$ are the resistance, reactance, active and reactive power, and the squared magnitude of current through the line (i, j), and $N_j$ is the set of child buses of j.
By eliminating $P_{ij}$ and $Q_{ij}$, we have the compact form of the LinDistFlow model as where the constant matrices R and X are derived from $r_{ij}$ and $x_{ij}$, respectively. The bus voltages should be limited to a range $[\underline{z}, \overline{z}]$, i.e., $\underline{z} \le z_t \le \overline{z}$. Formulas (20) and (21) jointly constitute the network constraint (2c). Since (2a) and (2b) apply directly, the energy sharing market of the distribution network follows the GNG model (2).
For the measurement feedback, we assume that measurement devices are installed at only a subset of buses, denoted by $\mathcal{N}^- \subset \mathcal{N}$. For $i \in \mathcal{N}^-$, the measured values consist of the bus voltage $v_i$ and the line power $P_{ij}$, $Q_{ij}$. Then we have the measurement equations as where the second equation holds from (19c).
Define the measured value vector y = [z i ] m , s ij m , the measurement noise ω y = ω z i , ω s ij , which derives the measurement equation The analyses above show that our online distributed GNE tracking algorithm can be used in the energy sharing market of the distribution network.

Settings
The test system in this paper is the IEEE 14-bus system with eight prosumers and four measurement devices. The topology of the distribution network is presented in Fig. 3, where prosumers are at buses 2, 5, 7, 8, 9, 11, 12, and 13, while measurement devices are at buses 2, 5, 6, and 8. The voltage of the point of common coupling is $v_0 = 1.03$ p.u. The parameters of base loads and distribution lines are taken from MatPower [25]. All parameters are given in per unit (p.u.). The studies are carried out on a desktop with Intel i7-10700 CPU and 16 GB memory. The simulation platform is MATLAB 2016B and commercial solver CPLEX is utilized to solve the formulated problems with the intermediary toolbox YALMIP.

Results
In this subsection, we validate the algorithm in the IEEE 14-bus distribution network. We obtain 100 results by running the simulation 100 times. In the following figures, lightly shaded areas represent the range of the results, while the dark bold curves are the mean values of these 100 results. Figure 4 shows the norms of the GNE curve $x_t^*$ and the real strategy $x_t$ generated by the ODT-GNE algorithm, and their deviation. The deviation decays very quickly, verifying the convergence result in Theorem 2. After about 5 min, the GNE tracking performance is acceptable. The error of the total cost of prosumers is much smaller than the strategy tracking error. As can be seen in Fig. 5, the cost in the GNE and the cost generated by the ODT-GNE algorithm are quite close. In most time slots, the relative error of cost is not more than 0.05%.
The curves of the market clearing prices behave similarly to those of the total cost. In Fig. 6, the market clearing price generated by the ODT-GNE algorithm converges quickly to the price in the GNE. After about 5 min, the relative error is limited within 10% and the mean value is only 2%.
As can be seen in Fig. 7, the deviation curves of the real system states and the estimated values validate the accuracy of the dynamic SE with the Kalman filter. During the first 5 min, the estimation error converges rapidly. After 5 min, the dynamic SE performs remarkably well in combating process and measurement noises and tracking the system states, which verifies Lemma 7. The relative error of the system state estimation stays within 0.01% as time goes by. Figure 8 shows the real bus voltage profiles of all 13 buses. The bus voltages rarely violate the bounds, which are marked by red dotted lines in the figure. Bus voltages are well regulated within the network constraint, although the system varies strongly, the system state is not entirely observable, and the measurement is disturbed by random noise.

Conclusion
In this paper, we have studied the online distributed tracking algorithm of the GNE of the resource sharing market on the physical network. To this end, we have combined the distributed GNE seeking method and the measurement with SE to formulate a closed-loop algorithm. The measurement of system states relieves the computational cost and improves the GNE tracking performance on the time-varying physical network. We have proved that the online closed-loop algorithm converges to a neighborhood of the GNE in expectation. Numerical simulation verifies the tracking performance of the online distributed GNE tracking algorithm via measurement feedback. It is expected that this work could provide useful insights and facilitate the implementation of online algorithms via measurement feedback in GNGs, which would inspire more applications in a wide range of fields.
Fig. 5 The curves of the total costs: $F_t^*$ in the GNE and $F_t$ generated by ODT-GNE
Fig. 6 The curves of the market clearing prices: $p_t^*$ in the GNE and $p_t$ generated by ODT-GNE
Fig. 7 The deviation of the real system state vector $z_t$ and its estimated value $\hat{z}_t$

Code availability
The simulation platform is MATLAB 2016B and commercial solver CPLEX is utilized to solve the formulated problems with the intermediary toolbox YALMIP.