On Risk Evaluation and Control of Distributed Multi-Agent Systems

In this paper, we deal with risk evaluation and risk-averse optimization of complex distributed systems with general risk functionals. We postulate a novel set of axioms for the functionals evaluating the total risk of the system. We derive a dual representation for the systemic risk measures and propose a way to construct non-trivial families of measures by using either a collection of linear scalarizations or non-linear risk aggregation. The new framework facilitates risk-averse sequential decision-making by distributed methods. The proposed approach is compared theoretically and numerically to some of the systemic risk measurements in the existing literature. We formulate a two-stage decision problem with monotropic structure and systemic measure of risk. The structure is typical for distributed systems arising in energy networks, robotics, and other practical situations. A distributed decomposition method for solving the two-stage problem is proposed and it is applied to a problem arising in communication networks. We have used this problem to compare the methods of systemic risk evaluation. We show that the proposed risk aggregation leads to less conservative risk evaluation and results in a substantially better solution of the problem at hand as compared to an aggregation of the risk of individual agents and other methods.


Introduction
Evaluation of the risk of a system consisting of multiple agents is one of the fundamental problems relevant to many fields. A crucial question is the assessment of the total risk of the system taking into account the risk of each agent and its contribution to the total risk. Another issue arises when the risk evaluation is based on confidential or proprietary information. There is extensive literature addressing the properties of risk measures and their use in finance. Our goal is to address situations related to robotics, energy systems, business systems, logistic problems, etc. The analysis in financial literature may not be applicable in such situations due to the heterogeneity of the sources of risk, the nature, and the complexity of relations in those systems. In many systems, the source of risk is associated with highly nontrivial aggregation of the features of its agents, which may not be available in an analytical form. For example, in automated robotic systems, the exchange of information may be limited or distorted due to the speed of operation, the distance in space between the agents, or other reasons. Another difficulty associated with the evaluation of risk arises when the risk of one agent stems from various sources of uncertainty of different nature. The question of how to aggregate those risk factors in one loss function does not have a straightforward answer.
The risk of one loss function can be evaluated using a coherent measure of risk such as Average Value-at-Risk, mean-semideviation, or others. More traditional (non-coherent) measures of risk such as Value-at-Risk (VaR) are also very popular and frequently used. We refer to [14] for an extensive treatment of risk measures for scalar-valued random variables, and to [31], where risk-averse optimization problems are analyzed as well.
The main objective of this paper is to suggest a new approach to the risk of a distributed system and show its viability and potential in application to risk-averse decision problems for distributed multi-agent systems. While building on the developments thus far, our goal is to identify a framework that is theoretically sound but also amenable to efficient numerical computations for risk-averse optimization of large multi-agent systems. We propose a set of axioms for functionals defined on the space of random vectors. The random vector comprises risk factors from various sources or represents the losses of the individual agents in a multi-agent system. While axioms for random vectors have been proposed earlier, our set of axioms differs from those in the literature most notably with respect to the translation equivariance condition, which we explain in due course. The resulting systemic risk measures reduce to coherent measures of risk for scalar-valued random variables when the dimension of the random vectors becomes one. We derive the dual representation of the systemic measures of risk with fewer assumptions than known for multivariate risks. In our derivation, we establish one-to-one correspondences between the axioms and properties of the dual variables. We also propose several ways to construct systemic risk measures and analyze their properties. The important features of the proposed measures are the following: they conform to the axioms, they can be calculated efficiently, and they are amenable to distributed optimization methods.
We have formulated a risk-averse two-stage optimization problem with a structure that is typical for a system of loosely coupled subsystems. The proposed numerical method is applied to manage the risk of a distributed operation of agents. The distributed method lets each subsystem optimize its operation with minimal information exchange among the other subsystems (agents). This aspect is important for multi-agent systems where some proprietary information is involved or when privacy concerns exist. The method demonstrates that distributed calculation of the systemic risk is possible without a big computational burden. We then consider a two-stage model in wireless communication networks, which extends the static model discussed in [21]. It addresses a situation when a team of robots explores an area and each robot reports relevant information. The goal is to determine a few reporting points so that the communication is conducted most efficiently while managing the risk of losing information. We conduct several numerical experiments to compare various systemic risk measures.
Our paper is organized as follows. In section 2, we provide preliminary information on coherent measures of risk for scalar-valued random variables and survey existing methods for risk evaluation of complex systems. Section 3 contains the set of axioms, the dual representation associated with the resulting systemic risk measures, and two ways to construct such measures in practice. Section 4 provides a theoretical comparison of the new measures of risk to other notions. In particular, we discuss other sets of axioms, explore relations to two notions of multivariate Average Value-at-Risk, and pay attention to the effect of the aggregation of risk before and after risk evaluation. In section 5, we formulate a risk-averse two-stage stochastic programming problem modeling wireless information exchange and seeking to locate a limited number of information exchange points. We devise a distributed method for solving the problem and report a numerical comparison with several measures of risk and other systemic measures. We pay attention to the comparison between the principles of aggregation for the purpose of total risk evaluation.

Coherent risk measures
The widely accepted axiomatic framework for coherent measures of risk was proposed in [2] and further analyzed in [8], [14], [20], [29,30], [25] and many other works. It is worth noting that another axiomatic approach was initiated in [18], and this line of thinking was developed into an entire framework in [27]. For a detailed exposition, we refer to [31] and the references therein. Let L_p(Ω, F, P) be the space of real-valued random variables, defined on the probability space (Ω, F, P), that have finite p-th moments, p ∈ [1, ∞), and are indistinguishable on events with zero probability. We shall assume that the random variables represent random costs or losses. A lower semicontinuous functional ϱ : L_p(Ω, F, P) → R ∪ {+∞} is a coherent risk measure if it is convex, positively homogeneous, monotonic with respect to the a.s. comparison of random variables, and satisfies the following translation property:
\[ \varrho[Z + a] = \varrho[Z] + a \quad \text{for all } Z \in \mathcal{L}_p(\Omega, \mathcal{F}, P),\ a \in \mathbb{R}. \]
If ϱ is monotonic, convex, and satisfies the translation property, then it is called a convex risk measure. Some examples of coherent measures of risk include the Average Value-at-Risk (also called Conditional Value-at-Risk) and the mean-semideviation measure, which are defined as follows. The Average Value-at-Risk at level α for a random variable Z is defined as
\[ \mathrm{AVaR}_\alpha(Z) = \inf_{\eta \in \mathbb{R}} \Big\{ \eta + \frac{1}{\alpha}\, E\big[(Z - \eta)_+\big] \Big\}. \]
It is a special case of the higher-order measures of risk:
\[ \varrho[Z] = \inf_{\eta \in \mathbb{R}} \Big\{ \eta + \frac{1}{\alpha}\, \big\|(Z - \eta)_+\big\|_p \Big\}, \]
where ‖·‖_p refers to the norm in L_p(Ω, F, P). The mean semi-deviation of order p is given by
\[ \varrho[Z] = E[Z] + \kappa\, \big\|(Z - E[Z])_+\big\|_p, \quad \kappa \in [0, 1]. \]
The space L_p(Ω, F, P) equipped with its norm topology is paired with the space L_q(Ω, F, P) equipped with the weak* topology, where 1/p + 1/q = 1. For any Z ∈ Z and ξ ∈ Z*, we use the bilinear form:
\[ \langle \xi, Z \rangle = \int_\Omega \xi(\omega)\, Z(\omega)\, dP(\omega). \]
The following result is known as a dual representation of coherent measures of risk. A proper lower semicontinuous coherent risk measure ϱ has a dual representation
\[ \varrho[Z] = \sup_{\xi \in \mathcal{A}_\varrho} \langle \xi, Z \rangle, \]
where A_ϱ ⊂ L_q(Ω, F, P) is a convex and closed set of probability density functions. Risk measures have also been defined by specifying a set of desired values for the random quantity in question; this set is called an acceptance set. Denoting the acceptance set by K ⊂ R, the risk of a random outcome Z is defined as: In finance, this notion of risk is interpreted as the minimum amount of capital that needs to be invested to make the final position acceptable. It is easy to verify that ϱ[·] in (2) is a coherent measure if and only if K is a convex cone (cf. [13]).
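The two coherent measures recalled above can be evaluated directly on a finite sample of losses. The following is a minimal sketch, assuming equally likely scenarios and the infimum formula for the Average Value-at-Risk with tail probability α; the function names `avar` and `mean_semideviation` are illustrative and do not come from the paper.

```python
import numpy as np

def avar(z, alpha):
    """Empirical AVaR_alpha of equally likely losses z (minimum over eta)."""
    z = np.asarray(z, dtype=float)
    # The objective eta + E[(Z - eta)_+] / alpha is convex and piecewise
    # linear in eta, so a minimizer can be found among the sample points.
    excess = np.maximum(z[None, :] - z[:, None], 0.0)  # excess[k, j] = (z_j - z_k)_+
    return float((z + excess.mean(axis=1) / alpha).min())

def mean_semideviation(z, kappa, p=1):
    """Empirical mean-upper-semideviation of order p with kappa in [0, 1]."""
    z = np.asarray(z, dtype=float)
    upper = np.maximum(z - z.mean(), 0.0)              # upper deviations from the mean
    return float(z.mean() + kappa * np.mean(upper ** p) ** (1.0 / p))

losses = [1.0, 2.0, 3.0, 4.0]
avar_half = avar(losses, alpha=0.5)                    # average of the worst half
msd_half = mean_semideviation(losses, kappa=0.5)
print(avar_half, msd_half)
```

With α = 1 the first function returns the sample mean, which is a convenient sanity check of the implementation.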

Risk measures for complex systems
As the risk is not additive, when we deal with distributed complex systems, we need to address the question of risk evaluation for the entire system.This risk is usually called systemic in financial literature and the proposed measures for its evaluation are termed systemic risk measures.
Assume that the system consists of m agents. One approach to evaluating the risk of a system is to use an aggregation function, Λ : R^m → R, and univariate risk measures. Let X ∈ L_p(Ω, F, P; R^m) be an m-dimensional random vector comprising the costs incurred by the system, where each component X_i corresponds to the costs of one agent. The first approach to systemic risk is to choose a univariate risk measure ϱ_0 and apply it to the aggregated cost Λ(X). If we prefer to use an acceptance set K as in (2), the systemic risk can be defined as: In [7], this point of view is analyzed in finite probability spaces and it is shown that any monotonic, convex, positively homogeneous function provides a risk evaluation as in (3) as long as it is consistent with the preferences represented in the definition of K. The point of view presented in definition (3) is further extended in [11], where the authors analyzed convex risk measures defined on a general measurable space and proposed examples of aggregation functions suitable for a financial system. In both studies, the structural decomposition of the systemic risk measure (3) is established when the aggregation function Λ satisfies properties similar to the axioms postulated for risk measures. In [4], the authors considered a particular case of an aggregation function, proposing an evaluation method for the risk associated with the cumulative externalities or costs endured by financial institutions. Note that these evaluation methods rely on a choice of one aggregation function suitable for a specific problem. The translation property for constant vectors is introduced in [5] for convex risk measures defined for bounded random vectors. This property differs from the one we propose here. The authors analyzed the maximal risk over a class of aggregation functions rather than using one specific function. We refer to [28] for an overview of the risk measures constructed this way. A similar approach is taken in [10], where law-invariant risk measures for bounded random vectors are investigated for the purpose of obtaining a Kusuoka representation. The axioms proposed in [5,10] are closest to ours and we provide a more detailed discussion in section 3.
Another approach to risk evaluation of complex systems consists of evaluating the risk of individual agents first and aggregating the obtained values next. This method is used, for example, in [3] and [12]. Using the notion of acceptance sets, the systemic risk measure is defined in [3] in the following way: The proposed measures of risk in section 3 also accommodate this point of view. A further extension in [3] replaces the constant vector z ∈ R^m by a random vector Y ∈ C, where C is a given set of admissible allocations. This formulation of the risk measure allows for scenario-dependent allocations, where the total amount ∑_{i=1}^m z_i can be determined ahead of time while the individual allocations z_i may be decided in the future when uncertainty is revealed. In [12], a set-valued counterpart of this approach is proposed by defining the systemic risk measure as the set of all vectors that make the outcome acceptable. Once the set of all acceptable allocations is constructed, one can derive a scalar-valued efficient allocation rule by minimizing the weighted sum of components of the vectors in the set. Set-valued risk measures were proposed in [17]; see also [1,16] for duality theory including the dual representation for certain set-valued risk measures. In the vast majority of the literature, the systemic risk depends on the choice of the aggregation function Λ and how well it captures the interdependence between the components. To capture the dependence, an approach based on copula theory was put forward in [24]. It is assumed that independent operation does not carry systemic risk and, hence, the local risk can be optimized by each agent independently. The systemic risk measures are then constructed based on the copulas of the distributions.
Another line of work includes methods that use some multivariate counterpart of the univariate risk measures. The main notion here is the Multivariate Value-at-Risk (MVaR) for random vectors, which is identified with the set of p-efficient points. Let F_X(·) be the right-continuous distribution function of a random vector X with realizations in R^m. A p-efficient point for X is a point v ∈ R^m such that F_X(v) ≥ p and there is no point z ≠ v that satisfies F_X(z) ≥ p with z ≤ v componentwise. This notion plays a key role in optimization problems with chance constraints (see e.g. [31]). Multivariate Value-at-Risk satisfies the properties of translation equivariance, positive homogeneity and monotonicity. This notion is used to define Average Value-at-Risk for multivariate distributions (MAVaR) in [19,23,26]. Let Z_p be the set of all points, each of which is component-wise larger than some p-efficient point, that is, the level set Z_p = {z ∈ R^m : F_X(z) ≥ p}. In [19], Lee and Prekopa define the MAVaR of a random vector X at level p as where Λ is assumed integrable with respect to F_X, i.e., E(Λ(X)) is finite. It is shown in [19] that MAVaR is translation equivariant, positively homogeneous and subadditive only when all of the components of the random vector are independent.
While the definition of MAVaR above is scalar-valued, in [22] the authors define a vector-valued Multivariate Average Value-at-Risk (VMAVaR) using the notion of p-efficient points as MVaR_p(X) and the extremal representation of the Average Value-at-Risk. First, for a given probability p ∈ (0, 1), we consider the vectors where Then, the following vector-optimization problem is solved: The vector-valued Multivariate Average Value-at-Risk is monotonic, positively homogeneous, translation equivariant, but is not subadditive. Note that in both MVaR and VMAVaR, one needs to use a scalarization function to obtain a scalar value for the risk.
We shall compare our proposal to the aforementioned risk measures in section 4.

Axiomatic Approach to Risk Measures for Random Vectors
In this section, we propose a set of axioms for measures of risk for random vectors with realizations in R^m. This framework is analogous to the properties of coherent risk measures for scalar-valued random variables. In fact, if m = 1, the proposed set of axioms exactly coincides with those in [31]. We denote by Z = L_p(Ω, F, P; R^m) the space of random vectors with realizations in R^m, defined on (Ω, F, P). Throughout the paper, we shall consider a risk measure ϱ for random vectors in Z to be a lower semicontinuous functional ϱ : Z → R ∪ {+∞} with non-empty domain. We denote the m-dimensional vector whose components are all equal to one by 1, and the random vector with realizations equal to 1 by I.
Definition 1. A lower semicontinuous functional ϱ : Z → R ∪ {+∞} is a coherent risk measure with preference to small outcomes iff it satisfies the following axioms:
A1. Convexity: For all X, Y ∈ Z and α ∈ (0, 1), we have ϱ[αX + (1 − α)Y] ≤ αϱ[X] + (1 − α)ϱ[Y].
A2. Monotonicity: For all X, Y ∈ Z, if X_i ≤ Y_i P-a.s. for all i = 1, . . ., m, then ϱ[X] ≤ ϱ[Y].
A3. Positive homogeneity: For all X ∈ Z and t > 0, we have ϱ[tX] = tϱ[X].
A4. Translation equivariance: For all X ∈ Z and a ∈ R, we have ϱ[X + aI] = ϱ[X] + a.
A lower semicontinuous functional ϱ : Z → R ∪ {+∞} is a convex risk measure with preference to small outcomes iff it satisfies axioms A1, A2, and A4.
The axioms of convexity and positive homogeneity are defined in a similar way to the properties of coherent risk measures, while the random vectors are now compared component-wise for the property of monotonicity. The main difference is the definition of the translation equivariance axiom. It suggests that if the random loss increases by a constant amount for all components, then the risk should also increase by the same amount. These axioms differ from the previous axioms proposed in the literature.

Dual representation
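In order to derive a dual representation of the multivariate risk measure, we pair the space of random vectors Z = L_p(Ω, F, P; R^m) with the space Z* = L_q(Ω, F, P; R^m), 1/p + 1/q = 1, using the bilinear form
\[ \langle \zeta, X \rangle = \int_\Omega \zeta(\omega)^\top X(\omega)\, dP(\omega). \]
The conjugate function of ϱ is
\[ \varrho^*[\zeta] = \sup_{X \in \mathcal{Z}} \big\{ \langle \zeta, X \rangle - \varrho[X] \big\}, \]
and the conjugate of ϱ* (the bi-conjugate function) is
\[ \varrho^{**}[X] = \sup_{\zeta \in \mathcal{Z}^*} \big\{ \langle \zeta, X \rangle - \varrho^*[\zeta] \big\}. \]
The Fenchel-Moreau theorem implies that if ϱ[·] is convex and lower semicontinuous, then ϱ** = ϱ and that
\[ \varrho[X] = \sup_{\zeta \in \tilde{A}} \big\{ \langle \zeta, X \rangle - \varrho^*[\zeta] \big\}, \tag{5} \]
where Ã = dom(ϱ*) is the domain of the conjugate function ϱ*. Then, based on the Fenchel-Moreau theorem and the axioms proposed in this paper, we show the following theorem.
Proof. Since ϱ[·] is convex and lower semicontinuous and we have assumed that it has non-empty domain, the representation (5) holds by virtue of the Fenchel-Moreau theorem.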
Take any X with support in ∆ such that ϱ[X] is finite and define X_t := X − tX. Then for t ≥ 0, we have that X ⪰ X_t, and ϱ[X] ≥ ϱ[X_t] by monotonicity. Consequently, It follows that ϱ*[ζ] = +∞ for every ζ ∈ Z* with at least one negative component, thus ζ ∉ dom ϱ*. Conversely, suppose that ζ ∈ Z* has realizations in R^m with nonnegative components P-a.s. Then whenever X ⪰ X′, we have: Hence, the monotonicity condition holds.
It follows from Theorem 1 that if a risk measure ϱ is lower semicontinuous and satisfies the axioms of monotonicity, convexity and translation equivariance, then representation (6) holds with the set A defined as: Proof.If ϱ is also positive homogeneous, then ϱ is the support function of A = dom(ϱ * ).Then To show the form of the set A recall that Hence, for all ζ ∈ A, (6)  We shall consider further the following property.
In the paper [5], the authors have adopted the following translation axiom: T. For any constant α ∈ R and any vector e_i whose i-th component is 1 (i = 1, . . ., m) and all other components are zero, we have ϱ[X + αe_i] = ϱ[X] + α. Theorem 2. Assume that ϱ is a proper lower semicontinuous convex risk functional. Property T holds if and only if Proof. Suppose T holds. Then for a random vector X in the domain of ϱ and every ζ ∈ Z*, we have This entails that for every constant vector a ∈ R^m, the risk value is The other direction is straightforward. Indeed, Additionally, property T also implies that Due to equation (12), for all X ∈ Z and a ∈ R, we obtain which completes the proof.
We also observe that a particular implication of Theorem 2 is that risk measures are linear on constant vectors. Corollary 3. If a coherent measure of risk ϱ[·] satisfies property T, then it is linear on constant vectors.
Proof. Indeed, a special case of (11) shows that This combined with the fact that ϱ[0] = 0 and the positive homogeneity of the risk measure proves the statement.
In [10], the authors have analyzed law-invariant risk measures for bounded random vectors.They have introduced a set of axioms that are closest to ours: their axioms include our axioms together with the two normalization properties ϱ[I] = 1 and ϱ[0] = 0. We do not need these normalization properties to establish the dual representation for general random vectors with finite p-moments, p ≥ 1; we derive that the risk of the deterministic zero vector is zero from the dual representation.The property of strong coherence of risk measures, introduced in that paper implies in particular that ϱ[a , which appears to be a strong assumption.

Risk measures obtained via sets of linear scalarizations
Suppose we have a random vector X ∈ Z = L_p(Ω, F, P; R^m) with a right-continuous distribution function F(X; ·) and marginal distribution function F_i(X_i; ·) of each component i = 1, . . ., m. We consider linear scalarization using vectors taken from the simplex
\[ S_m^+ = \Big\{ c \in \mathbb{R}^m : \sum_{i=1}^m c_i = 1,\ c_i \ge 0,\ i = 1, \dots, m \Big\}. \]
Let ϱ : L_p(Ω, F, P) → R ∪ {+∞} be a lower semicontinuous risk measure. For any fixed set S ⊂ S_m^+, we define the risk measure
\[ \varrho_S[X] = \varrho[X_S], \quad \text{where } X_S(\omega) = \max_{c \in S} c^\top X(\omega). \]
It is straightforward to see that X_S ∈ L_p(Ω, F, P) and hence, the risk measure ϱ_S[·] is well-defined on L_p(Ω, F, P; R^m).
Thus, the convexity axiom is satisfied.Given a random vector X ∈ Z and a constant t ∈ R, it follows: Positive homogeneity follows in a straightforward manner.
If the set S is a singleton, we obtain the following. Corollary 4. Let ϱ : L_p(Ω, F, P) → R ∪ {+∞} be a coherent (convex) risk measure. For any vector c ∈ S_m^+, the risk measure ϱ_c[X] = ϱ[c^⊤X] is coherent (convex) according to Definition 1.
Using the dual representation of the coherent risk measure ϱ for scalar-valued random variables, we obtain the following: Additionally, a measurable selection ν_X(ω) ∈ arg max_{c∈S} c^⊤X(ω) exists by the Kuratowski-Ryll-Nardzewski theorem; we shall use the notation ν_X ∈ S for any such selection.
Notice that the representations just derived have the form of the dual representation in (6); however, we have not established that Ã coincides with the domain of its conjugate function.
We observe the following properties of the aggregation by a single linear scalarization. Proposition 1. Given a coherent risk measure ϱ : Z → R and a scalarization vector c ∈ S_m^+, for any random vector X ∈ L_p(Ω, F, P; R^m), the risk of the vector measured by ϱ[c^⊤X] does not exceed the maximal risk of its components measured by ϱ[·]. Furthermore, the following relation between aggregation methods holds ϱ Proof. The dual representation implies the following: The penultimate relation implies the second claim of the proposition.
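The first claim of Proposition 1 can be checked numerically on sampled data. The sketch below assumes that ϱ is the empirical Average Value-at-Risk and that the relation between aggregation methods is ϱ[c^⊤X] ≤ ∑_i c_i ϱ[X_i], which follows from convexity and positive homogeneity; the sample, the weights, and the helper `avar` are hypothetical choices made for illustration only.

```python
import numpy as np

def avar(z, alpha):
    """Empirical AVaR_alpha of equally likely losses z."""
    z = np.asarray(z, dtype=float)
    excess = np.maximum(z[None, :] - z[:, None], 0.0)  # excess[k, j] = (z_j - z_k)_+
    return float((z + excess.mean(axis=1) / alpha).min())

rng = np.random.default_rng(0)
# Rows are scenarios, columns are the losses of three agents (synthetic data).
X = rng.normal(size=(500, 3)) * np.array([1.0, 2.0, 0.5])
c = np.array([0.2, 0.5, 0.3])                          # scalarization vector in the simplex
alpha = 0.1

aggregate_first = avar(X @ c, alpha)                   # risk of the scalarized loss
individual = np.array([avar(X[:, i], alpha) for i in range(X.shape[1])])
evaluate_first = float(c @ individual)                 # weighted sum of individual risks

print(aggregate_first, evaluate_first, individual.max())
```

Because the empirical Average Value-at-Risk is itself coherent with respect to the empirical distribution, the chain aggregate_first ≤ evaluate_first ≤ max_i ϱ[X_i] holds for any sample, not only on average.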
We also show the following useful result, which implies that we can use statistical methods to estimate the systemic risk measure ϱ_S[X]. Proposition 2. If ϱ : L_p(Ω, F, P) → R ∪ {+∞} is a law-invariant risk measure, then for any set S ⊂ S_m^+, the systemic risk measure ϱ_S[X] = ϱ[X_S] is law-invariant.
Proof. It is sufficient to show that for two random vectors X and Y, which have the same distribution, the respective random variables X_S and Y_S have the same distribution.
We observe that c^⊤X and c^⊤Y have the same distribution for any vector c ∈ R^m. Hence, for any real number r, the following relations hold: which shows the equality of the distribution functions.
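Since ϱ_S is law-invariant, it can be estimated from a sample of the random vector. The following is a minimal sketch, assuming a finite set S of scalarization vectors, equally likely scenarios, and the empirical Average Value-at-Risk in the role of ϱ; the names `avar` and `systemic_risk_linear` and the synthetic data are illustrative assumptions. The example also checks translation equivariance and positive homogeneity numerically.

```python
import numpy as np

def avar(z, alpha):
    """Empirical AVaR_alpha of equally likely losses z."""
    z = np.asarray(z, dtype=float)
    excess = np.maximum(z[None, :] - z[:, None], 0.0)
    return float((z + excess.mean(axis=1) / alpha).min())

def systemic_risk_linear(X, S, alpha):
    """Estimate rho_S[X] = AVaR_alpha(X_S), X_S(omega) = max over c in S of c^T X(omega)."""
    X = np.asarray(X, dtype=float)       # shape (scenarios, m)
    S = np.asarray(S, dtype=float)       # shape (|S|, m), rows in the simplex
    x_s = (X @ S.T).max(axis=1)          # worst scalarization in every scenario
    return avar(x_s, alpha)

rng = np.random.default_rng(1)
X = rng.exponential(size=(500, 3))
S = np.array([[1 / 3, 1 / 3, 1 / 3],
              [0.6, 0.3, 0.1],
              [0.1, 0.3, 0.6]])

base = systemic_risk_linear(X, S, alpha=0.1)
shifted = systemic_risk_linear(X + 2.0, S, alpha=0.1)   # add 2 to every component
scaled = systemic_risk_linear(3.0 * X, S, alpha=0.1)    # multiply all losses by 3
print(base, shifted, scaled)
```

The shift by a constant in all components moves the risk by the same constant because every c in S has components summing to one.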

Systemic Risk Measures Obtained via Nonlinear Scalarization
The second aggregation method that falls within the scope of our axiomatic framework is that of nonlinear scalarization. This class of risk measures cannot be obtained within the framework of aggregations by non-linear functions, and does not fit the axiomatic approaches in [7] or in [5]. Furthermore, we shall see that this method of evaluating systemic risk allows us to maintain fairness between the system's participants.
We define Ω_m = {1, . . ., m} and consider a probability space (Ω_m, F_c, c), where c ∈ S_m^+ and F_c contains all subsets of Ω_m. We view c as a probability mass function on the space Ω_m. Given an m-dimensional random vector X ∈ Z = L_p(Ω, F, P; R^m) and a collection of m univariate measures of risk ϱ_i : L_p(Ω, F, P) → R, i = 1, . . ., m, we define the random variable X_R on the space Ω_m as follows:
\[ X_R(i) = \varrho_i[X_i], \quad i = 1, \dots, m. \]
Choosing a scalar measure of risk ϱ_0 : L_p(Ω_m, F_c, c) → R, the measure of systemic risk ϱ_s : L_p(Ω, F, P; R^m) → R is defined as follows:
\[ \varrho_s[X] = \varrho_0[X_R]. \]
This is a nonlinear aggregation of the individual risks ϱ_i[X_i]; hence, this approach falls within the category of methods that evaluate the risk of each component first and then aggregate their values. The measure ϱ_s[X] satisfies the axioms postulated for systemic risk measures.
Proof. (i) Given any X, Y ∈ Z and α ∈ (0, 1), we consider the random vector we obtain that Z_R ≤ Z′. Using the monotonicity and convexity of ϱ_0, we obtain (ii) Suppose the vectors X, Y ∈ Z satisfy X ≤ Y a.s. This implies that X_i ≤ Y_i a.s. and, hence, Thus (A2) is satisfied.
(iii) Given a random vector X ∈ Z, t > 0, we have where we have used the positive homogeneity property of ϱ i [•] for all i = 0, 1, . . ., m.
(iv) Given a random vector X ∈ Z and a constant a, we have This shows property (A4).

Examples

A. Systemic Mean-AVaR measure
Consider the case when ϱ 0 is a convex combination of the expected value and the Average Value-at-Risk at some level α and all components of X are evaluated by the same measure of risk ϱ[•].Then for any κ ∈ [0, 1] and c ∈ S + m , we have: Here the infimum with respect to η ∈ R is taken over the individual risks of the components ϱ[X i ], i = 1, . . ., m.
Hence, this method of aggregation imposes additional penalties for the components whose risk exceeds some threshold.
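A small numerical sketch of this aggregation is given next. It assumes that the weight κ is placed on the Average Value-at-Risk term and 1 − κ on the expected value, so that ϱ_s[X] = (1 − κ) ∑_i c_i r_i + κ min_η {η + (1/α) ∑_i c_i (r_i − η)_+} with r_i = ϱ[X_i]; the placement of κ and the function names are assumptions made for illustration. The individual risks r_i are taken as given inputs.

```python
import numpy as np

def weighted_avar(r, c, alpha):
    """AVaR_alpha of the discrete variable equal to r[i] with probability c[i]."""
    r = np.asarray(r, dtype=float)
    c = np.asarray(c, dtype=float)
    # The minimum over eta is attained at one of the values r[k].
    excess = np.maximum(r[None, :] - r[:, None], 0.0)  # excess[k, i] = (r_i - r_k)_+
    return float((r + excess @ c / alpha).min())

def systemic_mean_avar(r, c, kappa, alpha):
    """(1 - kappa) * weighted mean of risks + kappa * AVaR of risks under c."""
    r = np.asarray(r, dtype=float)
    c = np.asarray(c, dtype=float)
    return float((1.0 - kappa) * (c @ r) + kappa * weighted_avar(r, c, alpha))

r = [1.0, 2.0, 3.0, 4.0]          # individual risks rho[X_i] of four agents
c = [0.25, 0.25, 0.25, 0.25]      # probability mass function on the agents
value = systemic_mean_avar(r, c, kappa=0.5, alpha=0.5)
print(value)
```

For κ = 0 the measure reduces to the weighted average of the individual risks; increasing κ shifts weight toward the agents whose risk lies in the upper tail.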

B. Systemic Mean-Semideviation measure
Now let ϱ 0 be a Mean-Upper-Semideviation risk measure of the first order and all components of X are evaluated by the same measure of risk ϱ[•].Then the measure of systemic risk can be defined as: The last representation shows that this risk measure is an aggregation of the individual risk of the components, which compares the risk of every component with the weighted average risk of all components and penalizes the deviation of the individual risk from that average.
The presented method of non-linear aggregation maintains fairness within the system and keeps the components functioning within the same level of risk.
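The penalization of deviations from the average risk can be illustrated as follows. The sketch assumes the first-order form ϱ_s[X] = ∑_i c_i r_i + κ ∑_i c_i (r_i − ∑_j c_j r_j)_+ with r_i = ϱ[X_i] and κ ∈ [0, 1]; the function name is hypothetical and the individual risks are treated as given numbers.

```python
import numpy as np

def systemic_mean_semideviation(r, c, kappa):
    """Weighted mean of individual risks plus penalty for risks above that mean."""
    r = np.asarray(r, dtype=float)
    c = np.asarray(c, dtype=float)
    average = c @ r                                    # weighted average risk
    penalty = c @ np.maximum(r - average, 0.0)         # upper deviations only
    return float(average + kappa * penalty)

c = [0.25, 0.25, 0.25, 0.25]
uneven = systemic_mean_semideviation([1.0, 2.0, 3.0, 4.0], c, kappa=1.0)
even = systemic_mean_semideviation([2.5, 2.5, 2.5, 2.5], c, kappa=1.0)
print(uneven, even)
```

Both configurations have the same average risk, but the uneven one is evaluated as riskier, which reflects the fairness interpretation given in the text.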

Relations to multivariate measures of systemic risk
In this section, we compare the proposed risk measures with the multivariate notions mentioned in section 2.2.
Consider first the Multivariate Value-at-Risk (MVaR), which is given as the set of p-efficient points of the respective probability distribution. The following facts are shown in [9]. For every p ∈ (0, 1), the level set Z_p of the distribution function of a random vector X is nonempty and closed. For a given scalarization vector c ≥ 0, the p-efficient points can be generated by solving the following optimization problem:
\[ \min_{z}\ c^\top z \quad \text{subject to } z \in Z_p. \tag{17} \]
For every c ≥ 0, the solution set of the optimization problem (17) is nonempty and contains a p-efficient point. Hence, given a random vector X ∈ Z and a scalarization vector c ∈ S_m^+, MVaR at level p ∈ (0, 1) can be calculated as: Therefore, using linear scalarizations, one can find the p-efficient point corresponding to any given vector c ∈ S_m^+. Consider now the Multivariate Average Value-at-Risk (MAVaR) defined in (4). When small outcomes are preferred, the unfavorable set of realizations of a random vector X is given by the p-level set of F(X; ·). If the scalarization function ψ(X) is monotonically nondecreasing, then P(ψ(X) ≤ ψ(v)) ≥ p.
Denote the p-quantile of ψ(X) by η_X(p). Then we observe that η_X(p) ≤ min_v ψ(v). Therefore: for all p ∈ (0, 1) where the cumulative distribution function of ψ(X) is continuous. It follows that the Average Value-at-Risk of X scalarized by a monotonically nondecreasing function ψ has a smaller value than MAVaR defined in (4). This implies in particular that for any S ⊂ S_m^+, MAVaR_p(X) ≥ ϱ_S[X]. We now turn to the Vector-valued Multivariate Average Value-at-Risk. It is calculated as one of the Pareto-efficient optimal solutions of the following optimization problem: It is well known that a feasible solution of a convex multiobjective optimization problem is Pareto-efficient if and only if it is an optimal solution of the scalarized problem with an objective function which is a convex combination of the multiple objectives. Then VMAVaR, which is the Pareto-efficient solution of the multiobjective optimization problem (18), is also optimal for the following problem: where c ∈ R^m is a scalarization vector taken from the simplex S_m^+. Now for X ∈ L_p(Ω, F, P; R^m), we consider: due to the convexity of the max function. It follows that: In the scalar-valued case (m = 1), the minimizer of the optimization problem defining AVaR_p(Z) is the VaR_p(Z) for a random variable Z. In the multivariate case (m > 1), we established that the solution of (17) is the p-efficient point, or VaR_p(X), corresponding to a given scalarization vector c ∈ S_m^+. Denoting this p-efficient point as v(c), it follows that: Denoting the p-quantile of c^⊤X as η_X(p; c), it follows that: η_X(p; c) ≤ c^⊤v(c), i.e., η_X(p; c) is not larger than c^⊤v(c). Therefore: It follows that the scalarization of VMAVaR results in a smaller value of the Average Value-at-Risk of the scalarized random vector, which is one of the systemic risk measures following the constructions in section 3.
We do not pursue further investigation on set-valued systemic measures of risk as their calculation is numerically very expensive.
Two-stage stochastic programming problem with systemic risk measures
Our goal is to address a situation when the agents cooperate on completing a common task and risk is associated (among other factors) with the successful completion of the task. Situations of this type are typical in robotics, as well as in energy systems, where the units cover the energy demand in a certain area.

Two-stage monotropic optimization problem with a systemic risk measure
In this section, we consider how the proposed approaches to evaluate systemic risk can be applied to a two-stage stochastic optimization problem with a monotropic structure.Specifically, we focus on a problem formulated as follows: where Q(x; ξ) has realizations Q s (x; ξ s ) defined as the optimal value of the second-stage problem in scenario s ∈ S: Here f : R n → R is a continuous function that represents the cost of the first-stage decision x ∈ R n and X ⊂ R n is a closed convex set.The random vector ξ comprises the random data of the second-stage problem.In the second-stage problem, we would like to minimize the sum of m cost functions g i : R l × R p → R for i = 1, . . ., m that depend on two second-stage decision variables: local decision variables y i ∈ R l for i = 1, . . ., m and the common decision variable z ∈ R p .The decision variables y i ∈ R l are local for every i = 1, . . ., m, and the local constraints are represented as a closed convex set Y s i ⊂ R l .The decision variable z ∈ R p is common for all i and needs global information to be calculated.The matrix B s is of size d × p and the set D s ⊂ R d is a closed convex set.Note that the constraints (22) linking the first-stage decision variable x and the local second-stage decision variables y i are defined for every i separately, where matrices T s i ∈ R k×n , W s i ∈ R k×l and h s i ∈ R k depend on the scenario s.The constraint ( 23) is a coupling constraint that links the local decision variables y i , where A s i ∈ R d×l and b s ∈ R d depend on the scenario s ∈ S.
We define the total cost as the aggregation of the individual cost functions g_i using some scalarization vector c ∈ R^m_+ such that ∑_{i=1}^m c_i = 1, and we would like to develop a numerical method to solve the two-stage problem in a distributed way. Specifically, we use decomposition ideas based on the risk-averse multicut method proposed in [15] and the multicut methods in risk-neutral stochastic programming to solve the two-stage problem, but we also decompose the second-stage problem into m subproblems that can be solved independently in order to allow for a distributed operation of m units (agents).
First, we discuss how to apply the decomposition method to solve the two-stage problem. We use the multicut method to construct a piecewise linear approximation of the optimal value of the second-stage problem, and we approximate the measure of risk by subgradient inequalities based on the dual representation of coherent risk measures, ϱ[Q] = sup_{µ∈A_ϱ} ⟨µ, Q⟩. To this end, we introduce an auxiliary variable η ∈ R, which will contain the lower approximation of the measure of risk. Further, we denote by Q the random variable with realizations q^s, which represent the lower approximations of the functions Q^s(·, ξ^s). The master problem (26) of our method is then formulated in the variables (x, η, q).
The optimal value η^t contains the value of the approximation of ϱ[Q(x^t; ξ)], where x^t is the solution of the master problem at iteration t. Notice that the approximation is built from µ^τ, the probability measures from A_ϱ calculated as subgradients in the previous iterations. We shall explain how the subgradients µ^τ are obtained in due course. The value q^{s,τ} is the optimal value of the second-stage problem in scenario s at iteration τ, and g^{s,τ} is the subgradient calculated using the optimal dual variables of the constraints (22). One can solve the second-stage problem where the objective function consists of a scalarization of m cost functions, but we would like to decompose the second-stage problem into m subproblems Q^s_i that can be solved independently in a distributed manner.
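To illustrate how a subgradient µ from the dual representation can be obtained, the following minimal sketch computes a maximizer of ⟨µ, q⟩ for the Average Value-at-Risk over finitely many scenarios. It is an illustration under stated assumptions, not the procedure of the paper: we assume that α is the tail probability, so that the dual set is {µ : 0 ≤ µ_s ≤ p_s/α, ∑_s µ_s = 1}, and the function name `avar_dual_measure` is hypothetical.

```python
def avar_dual_measure(losses, probs, alpha):
    """Return (AVaR_alpha value, maximizing measure mu) for a discrete loss.

    Assumption: alpha in (0, 1] is the tail probability, so the dual set is
    {mu : 0 <= mu_s <= p_s / alpha, sum_s mu_s = 1}.
    """
    # Visit scenarios from the largest loss to the smallest one.
    order = sorted(range(len(losses)), key=lambda s: losses[s], reverse=True)
    mu = [0.0] * len(losses)
    remaining = 1.0  # mass that still has to be placed on the worst scenarios
    for s in order:
        mu[s] = min(probs[s] / alpha, remaining)
        remaining -= mu[s]
        if remaining <= 0.0:
            break
    value = sum(m * q for m, q in zip(mu, losses))
    return value, mu


value, mu = avar_dual_measure([1.0, 2.0, 3.0, 4.0], [0.25] * 4, alpha=0.5)
```

The returned µ is a probability measure concentrated on the worst scenarios; in a cutting-plane scheme of the kind described above, such a measure could serve as the subgradient generating a new linear inequality for η.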
Consider the second-stage problem Q^s(x; ξ^s) for a fixed first-stage decision variable x ∈ R^n. To decompose the global problem into m local subproblems, we need to handle two issues: (i) distribute the common decision variable z ∈ R^p to the individual subproblems i; (ii) decompose the coupling constraints. The common decision variable z can be distributed to the subproblems by creating its copy z_i for every i = 1, ..., m. Then we ensure the uniqueness of z by enforcing the decision variables z_i to be equal to each other. The second-stage problem Q(x; ξ) can then be rewritten in terms of the copies z_i, subject to the local constraints (28) and the coupling constraints (29), (30).
In order to distribute the coupling constraints (29), (30), we can apply Lagrange relaxation using Lagrange multipliers λ^s ∈ R^d and µ^s ∈ R^{m×m}. Then, using the global augmented Lagrangian problem Λ^s_{κ_0} associated with the second stage, we can calculate the optimal value of the objective function Q^s_i for every subproblem i. The global objective function of the second-stage problem can be calculated as Q^s(x; ξ^s) = ∑_{i=1}^m Q^s_i(x; ξ^s) − ⟨λ^s, b^s⟩. Once the second-stage problem is solved for every scenario s, we construct objective cuts for every scenario s ∈ S defined as
q^s ≥ Q^s(x^k; ξ^s) + ⟨g^{s,k}, x − x^k⟩,
where g^{s,k} is the subgradient of Q^s(x; ξ^s) at x = x^k in scenario s ∈ S. Now note that, by the formula above, Q^s(x; ξ^s) differs from the sum of the local values Q^s_i(x; ξ^s) only by the term ⟨λ^s, b^s⟩. Hence, at x = x^k, the subgradient for scenario s ∈ S can be calculated as ∂Q^s(x^k; ξ^s) = ∑_{i=1}^m ∂Q^s_i(x^k; ξ^s). The subgradient ∂Q^s_i(x^k; ξ^s) is given as −(T^s_i)^⊤ π^s_i, where π^s_i is the Lagrange multiplier associated with the constraint (28) in subproblem i. The proposed method for solving the two-stage problem is formulated as follows:
Step 0. Set t = 1 and define initial µ^0 ∈ A_ϱ.
Step 1. Solve the master problem (26) and denote its optimal solution as (x^t, η^t, q^t).
(b) Given the Lagrange multipliers λ^{s,l}, µ^{s,l} and the decision variables of the neighboring nodes y^{s,l}, z^{s,l}, every node i calculates its optimal solution (ŷ^{s,l}_i, ẑ^{s,l}_i) by solving its local problem with the objective function (35).
(c) Every node i updates its primal variables.
If the constraints (36) are satisfied, then calculate the scenario value and the subgradient from π^{s,l}_i and Λ^{s,i,l}_{κ_0}, and go to Step 3. Here π^{s,l}_i is the optimal Lagrange multiplier associated with the constraint (33) in subproblem i, and Λ^{s,i,l}_{κ_0} is the optimal value of the objective function (35). If any of the constraints (36) are not satisfied, update their Lagrange multipliers as follows:
µ^{s,l+1}_{i,j} = µ^{s,l}_{i,j} + κ^s_0 κ^s (z^{s,l}_i − z^{s,l}_j).
Increase l by one and return to Step (b).
Step 4. If ϱ^t = η^t, stop; otherwise, increase t by one and go to Step 1.
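The bookkeeping performed once the local subproblems of a scenario have been solved can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that every node i reports its local optimal value, its matrix T_i (as a list of rows), and its multiplier vector π_i, and it uses the relations stated above, namely the value ∑_i Q_i − ⟨λ, b⟩, the subgradient −∑_i (T_i)^⊤ π_i, and the multiplier step µ_{i,j} + κ_0 κ (z_i − z_j). All function and argument names are hypothetical.

```python
def second_stage_cut(local_values, lam, b, T_blocks, pi_blocks):
    """Combine local results into the scenario value and a subgradient in x.

    value = sum_i Q_i - <lam, b>,    g = -sum_i T_i^T pi_i
    """
    value = sum(local_values) - sum(l * bj for l, bj in zip(lam, b))
    n = len(T_blocks[0][0])  # number of first-stage variables
    g = [0.0] * n
    for T, pi in zip(T_blocks, pi_blocks):
        for row, pi_r in zip(T, pi):
            for j in range(n):
                g[j] -= row[j] * pi_r  # accumulate -(T_i^T pi_i)_j
    return value, g


def update_consensus_multiplier(mu_ij, z_i, z_j, kappa0, kappa):
    """One dual step mu_ij + kappa0 * kappa * (z_i - z_j), taken componentwise."""
    return [m + kappa0 * kappa * (zi - zj) for m, zi, zj in zip(mu_ij, z_i, z_j)]


# Two nodes, two first-stage variables, one coupling constraint.
value, g = second_stage_cut(
    local_values=[4.0, 6.0], lam=[1.0], b=[2.0],
    T_blocks=[[[1.0, 0.0], [0.0, 2.0]], [[1.0, 1.0]]],
    pi_blocks=[[1.0, 0.5], [2.0]],
)
```

The pair (value, g) is the data needed for one objective cut of the master problem; the multiplier update is applied only when the consistency constraints between copies are still violated.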
Note that the penalty parameter κ^s_0 can be chosen for every scenario s ∈ S according to the structure of the problem. The ADAL method converges to the optimal solution in scenario s if the penalty parameter κ^s_0 ∈ (0, 1/q^s), where q^s is the maximum number of nonzero rows in the matrices A^s_i for i = 1, ..., m. Hence, κ^s_0 can be chosen close to 1/q^s.
using appropriate matrices B^s_i, C^s_i for every node i ∈ J, where 1 is a vector of all ones. Note that since nodes can share information only with their neighbors, one can enforce the equality of the proportion variable between neighboring nodes and rewrite the constraint (42) in the form (46), where N^s(i) is the set of nodes within communication range of node i ∈ J in scenario s ∈ S. If the network is connected, constraint (46) enforces all x_i to be equal to each other and ensures the consistency and uniqueness of x_i.
We assume that the team of robots works on a square map given by the points with relative coordinates (0, 0) and (2, 2). The spatial distribution of available information to be gathered follows a normal distribution with an expected value C = (0.5, 1.75) in the upper left corner of the map. The network consists of 50 robots and 4 potential locations of the reporting points. We generated 200 scenarios for different spatial configurations of the robots. The four potential locations for the reporting points are fixed in the positions (0.5, 0.3), (1.5, 0.25), (1.75, 0.5), (1, 0.2). The rate function R^s_ij depends on the distance ∥d^s_ij∥ between the nodes in scenario s ∈ S and is defined piecewise with respect to the thresholds ℓ and u. We set ℓ = 0.3 and u = 0.6, and the values a, b, c and e are chosen so that R^s_ij is a continuous function. This function is commonly used in the literature, see e.g. [32]. The information r^s_i gathered by robot i in scenario s depends on the robot's position relative to the expected value C given above. In our experiments, r^s_i is calculated from d_i, w, and Σ, where d_i is the position of robot i ∈ J, w is a scaling factor, and Σ is a covariance matrix, which are kept fixed for all experiments.
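A distance-dependent rate function with thresholds ℓ = 0.3 and u = 0.6 can be sketched as follows. The exact formula and the coefficients a, b, c, e of the paper are not reproduced here. The sketch assumes one common shape: the rate equals 1 below ℓ, equals 0 above u, and is a cubic polynomial in between, fitted so that the function is continuous and has zero slope at both thresholds. The name `link_rate` is hypothetical.

```python
def link_rate(distance, lower=0.3, upper=0.6):
    """Assumed rate model: 1 for short links, 0 beyond `upper`, cubic in between."""
    if distance < lower:
        return 1.0
    if distance > upper:
        return 0.0
    # Normalized position of the distance inside [lower, upper].
    t = (distance - lower) / (upper - lower)
    # Cubic with value 1 at t = 0, value 0 at t = 1, and zero slope at both ends.
    return 1.0 - 3.0 * t ** 2 + 2.0 * t ** 3


rates = [link_rate(d) for d in (0.2, 0.3, 0.45, 0.6, 0.7)]
```

Continuity at both thresholds is the property required in the text; the zero-slope condition is an extra assumption used only to pin down the four coefficients of the cubic.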
Comparison of aggregation methods. We solved the optimization problem using two different aggregation methods:
• aggregate first: Using the proposed multivariate measures of risk, we aggregate the individual losses of the robots with a fixed scalarization w; that is, we calculate V^s = ∑_{i∈J} w_i q^s_i for each scenario and evaluate its risk ϱ[V] by several scalar-valued measures of risk;
• evaluate first: We evaluate the individual risk of every robot across all scenarios and calculate V_i = ϱ_i[q_i]. Then we aggregate their values ϱ_S[V] using two examples of nonlinear aggregation shown in section 4.2.1.
We solve the problem using a linear scalarization vector w with equal weights w_i = 1/|J| for all i ∈ J, c = [0.8, 0.2], and AVaR_α(·) for three values α = 0.1, 0.2, 0.3.
The setup of the communication network problem and the optimal solutions in one of the scenarios for both methods are shown in Fig. 1. One can notice that, depending on which aggregation method is used, the set of optimal reporting points might be different. The distribution of the proportion x of information delivered to the reporting points for the two methods is shown in Fig. 2. It can be seen that more information is delivered to the reporting points if we aggregate the losses of the robots first and then evaluate the risk. This observation is also reflected in the values of the risk for both methods: imposing a risk measure on a linear scalarization of the individual losses results in smaller values than the aggregation of individual risks.
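The ordering of the two risk values can be reproduced on a toy example. The sketch below is only illustrative and rests on several assumptions: scenarios are equally likely, α is the tail probability of AVaR, and the "evaluate first" variant uses a plain weighted sum of individual AVaR values as a stand-in for the nonlinear aggregations of section 4.2.1, which are not reproduced here. All function names are hypothetical.

```python
def avar(losses, alpha):
    """AVaR_alpha of equally likely scenario losses (alpha = tail probability)."""
    n = len(losses)
    remaining = alpha  # tail mass that still has to be covered
    total = 0.0
    for z in sorted(losses, reverse=True):
        mass = min(1.0 / n, remaining)
        total += mass * z
        remaining -= mass
        if remaining <= 0.0:
            break
    return total / alpha


def aggregate_first(agent_losses, weights, alpha):
    """Risk of the scalarized loss V^s = sum_i w_i q_i^s."""
    n_scen = len(agent_losses[0])
    total = [sum(w * q[s] for w, q in zip(weights, agent_losses))
             for s in range(n_scen)]
    return avar(total, alpha)


def evaluate_first_linear(agent_losses, weights, alpha):
    """Weighted sum of the individual risks (a simple stand-in aggregation)."""
    return sum(w * avar(q, alpha) for w, q in zip(weights, agent_losses))


# Two agents whose large losses occur in different scenarios.
losses = [[0.0, 10.0, 0.0, 0.0], [0.0, 0.0, 10.0, 0.0]]
risk_of_total = aggregate_first(losses, [0.5, 0.5], alpha=0.25)
total_of_risks = evaluate_first_linear(losses, [0.5, 0.5], alpha=0.25)
```

In this example the risk of the aggregated loss is 5, while the weighted sum of the individual risks is 10, because the worst cases of the two agents do not occur in the same scenario. For a coherent measure of risk, the first value never exceeds the second.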
Using the optimal values of the decision variables, we can calculate MAVaR and VMAVaR to compare their values with the AVaR applied to the linear scalarization of the random cost. The values of AVaR, MAVaR and VMAVaR, calculated by the corresponding formulas, are shown in Table 1. It can be seen that AVaR_α(V) results in smaller values than MAVaR_α(V) and VMAVaR_α(V) at all confidence levels α, as was shown theoretically in section 4. Those measures of risk are computationally very demanding and not amenable to the type of decision problems we are considering. Hence, we only compare their values for the decision obtained via our proposed method.
When we solve the problem in a distributed way, we use a smaller network consisting of 20 robots and 4 reporting points in a 1.5 by 1.5 square over 100 scenarios. It is assumed that the network is connected in all possible scenarios, that is, every node has at least one neighbor within the communication range, and all nodes are connected to the reporting points through multiple hops. This assumption is necessary for the proper calculation of the proportion of information delivered to the reporting points. If one of the nodes is isolated from the network, the rest of the group converges to a solution that does not take into account the isolated node's contribution. The problem is solved in both centralized and distributed ways, and the results for one of the scenarios are shown in Fig. 3. As can be seen in Fig. 3 (b), the nodes converge to the centralized solution for the proportion of information delivered to the reporting points.
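The connectivity assumption can be verified for a given scenario before the distributed method is run. The following minimal sketch performs a breadth-first search on the graph in which two nodes are neighbors whenever their Euclidean distance does not exceed a communication range. The function name `is_connected` and the range value used in the example are assumptions made for illustration only.

```python
from collections import deque
from math import dist


def is_connected(positions, comm_range):
    """Return True if every node can reach every other node through hops."""
    n = len(positions)
    if n == 0:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            # Nodes i and j are neighbors if they are within communication range.
            if j not in seen and dist(positions[i], positions[j]) <= comm_range:
                seen.add(j)
                queue.append(j)
    return len(seen) == n


chain = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
ok = is_connected(chain, comm_range=0.6)
```

A scenario that fails this check contains at least one group of nodes cut off from the rest, which is the situation the assumption above is meant to exclude.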
X⟩, then ϱ is positively homogeneous as a support function of a convex set.
(iii) Suppose the translation property is satisfied, i.e., ϱ[X + tI] = ϱ[X] + tϱ[I] for any X ∈ Z and a constant t ∈ R. Then for any k ∈ R and ζ ∈ Z*, we get: implies that ζ ∈ ∂ϱ[0]. On the other hand, if ζ ∈ ∂ϱ[0], then ζ ∈ A by the definition of a support function.

Theorem 3. If ϱ : L_p(Ω, F, P) → R ∪ {+∞} is a coherent (convex) risk measure, then for any set S ⊂ S^+_m, the risk measure ϱ_S[X] = ϱ[X_S] is coherent (convex) according to Definition 1.
Proof. For two random vectors X, Y ∈ L_p(Ω, F, P; R^m) with X ≤ Y component-wise a.s., we have c^⊤X ≤ c^⊤Y a.s. for all c ∈ S^+_m. This implies that max_{c∈S} c^⊤X ≤ max_{c∈S} c^⊤Y a.s. and, hence, ϱ[X_S] ≤ ϱ[Y_S]. Thus, the monotonicity axiom is satisfied. Given two random vectors X, Y ∈ Z and α ∈ [0, 1], consider their convex combination αX + (1 − α)Y. Due to the convexity and monotonicity of ϱ[·], we have

Figure 1: Communication network of 50 robots and 4 reporting points in one scenario. The source is located in the upper left corner. The lighter color (yellow) and darker color (purple) indicate higher and lower rates of information generation, respectively. (a) The initial spatial configuration of robots (green) and the reporting points (blue). The lines show communication links and their thickness indicates the rate of connection R_ij between nodes i and j. (b) The optimal routing decisions between nodes when the risk is applied to the total loss of all nodes. Blue nodes are selected and red nodes are not selected. (c) The optimal routing decisions between nodes when the individual risks of the nodes are aggregated. Blue nodes are selected and red nodes are not selected.

Figure 2: (a) Proportion of information delivered to the reporting points using Mean-AVaR at α = 0.1. (b) Proportion of information delivered to the reporting points using Mean-Upper-Semideviation of order 1. (c) Comparison of the risk values for two aggregation methods.