Probabilistic Constrained Optimization on Flow Networks

Uncertainty often plays an important role in dynamic flow problems. In this paper, we consider both a stationary and a dynamic flow model with uncertain boundary data on networks. We introduce two different ways to compute the probability that random boundary data are feasible, and we discuss their advantages and disadvantages. In this context, feasible means that the flow corresponding to the random boundary data meets some box constraints at the network junctions. The first method is the spheric radial decomposition and the second method is a kernel density estimation. In both settings, we consider certain optimization problems and we compute derivatives of the probabilistic constraint using the kernel density estimator. Moreover, we derive necessary optimality conditions for the stationary and the dynamic case. Throughout the paper, we use numerical examples to illustrate our results by comparing them with a classical Monte Carlo approach to compute the desired probability.


Introduction and Motivation
In this paper, we present a method for dealing with uncertain loads in the context of flow networks. The modeling and simulation of flow networks, for example gas flow, water flow and the diffusion of harmful substances in such networks, have become increasingly important. In this paper, we analyze the gas flow through a pipeline network in a stationary setting and the contamination of water in a dynamic setting. The aim of this paper is to derive necessary optimality conditions for chance constrained optimization problems in the context of flow networks.
Gas transport and, more generally, flow problems have been the subject of many studies. In general, such a flow problem is modeled as a system of hyperbolic balance laws based, e.g., on the isothermal Euler equations (for gas flow, see e.g. [4,5,17,21,26,27]) or the shallow water equations (for water flow, see e.g. [6,10,23,24,35]). The model in the stationary setting in this paper is based on the stationary isothermal Euler equations for modeling the gas flow through a pipeline network. In [12,34] one can find a comprehensive overview of the topic of gas transport, existing models, and network elements. The existence of a unique stationary state is shown in [21], and the stationary states for real gas are analyzed in [26,30]. The existence of solutions for the dynamic case has been analyzed in [28,29]. Optimal control problems in gas networks have been studied e.g. in [7,9,22]. For the problem of contamination of water by harmful substances, we use a linear scalar hyperbolic balance law. This has also been analyzed in [16,20].
An important aspect of this paper is that we consider random boundary data. In the context of gas transport, this means that the loads (i.e. the gas demand) are random. In the context of water contamination, it means that the contaminant injection is random. This leads to optimization problems with probabilistic constraints (see e.g. [38,41]). We also assume box constraints for the solution of the balance law at the network nodes and we define the set of feasible loads M as the set of all loads for which the solution of the balance law meets these box constraints. Our aim is to compute the probability for a random load vector to be feasible, i.e. we want to compute the probability for a random vector to be in a certain set M. So we identify the load vector with a random vector ξ on an appropriate probability space (Ω, A, P) and we want to compute the probability, which we write as P(ξ ∈ M).
A direct approach to computing this probability is to integrate the probability density function of the balance law solution over the box constraints. However, this density function may not be known. We use a kernel density estimator (see e.g. [36,37,40]) to obtain an approximation of this function. In [13,31] one can find a comprehensive overview of the area of nonparametric statistics. The authors in [17,27] use the spheric radial decomposition (see e.g. [1,2,14,17,18]) for a similar gas flow problem to compute the desired probability.
Theorem 1. (spheric radial decomposition, see [17], Theorem 2) Let ξ ∼ N(0, R) be an n-dimensional Gaussian random vector with zero mean and positive definite correlation matrix R. Then, for any Borel measurable subset M ⊆ R^n it holds that

P(ξ ∈ M) = ∫_{S^{n−1}} µ_χ({r ≥ 0 : r L v ∈ M}) dµ_η(v),

where S^{n−1} is the (n−1)-dimensional sphere in R^n, µ_η is the uniform distribution on S^{n−1}, µ_χ denotes the χ-distribution with n degrees of freedom and L is such that R = L L^T (e.g., by Cholesky decomposition).
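As an illustration of Theorem 1, the following sketch applies the decomposition to a toy example in which everything is known in closed form: ξ ∼ N(0, I_2) and M = [−1, 1]^2, so L is the identity, every one-dimensional section is an interval, and the χ-distribution with 2 degrees of freedom has the explicit cdf F(r) = 1 − exp(−r^2/2). The example and all names are ours, not from the paper.

```python
import math

def srd_box_probability(n_dirs=20000):
    """SRD from Theorem 1 for xi ~ N(0, I_2) and M = [-1, 1]^2 (so L = I).

    For each direction v on S^1, {r >= 0 : r*v in M} is the interval
    [0, r_max(v)], and its measure under the chi-distribution with
    2 degrees of freedom is F(r_max) = 1 - exp(-r_max^2 / 2).
    """
    total = 0.0
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        # min(1/|cos|, 1/|sin|) = 1/max(|cos|, |sin|), avoiding division by zero
        r_max = 1.0 / max(abs(math.cos(theta)), abs(math.sin(theta)))
        total += 1.0 - math.exp(-0.5 * r_max ** 2)
    return total / n_dirs  # uniform average over the sphere S^1

p_srd = srd_box_probability()
# closed-form reference: P(xi in M) = P(|X| <= 1)^2 for X ~ N(0, 1)
p_exact = math.erf(1.0 / math.sqrt(2.0)) ** 2
```

The spherical integral is approximated by a simple equidistant rule; since the decomposition itself is exact, the only error is the quadrature error over the sphere.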
From now on we use SRD instead of spheric radial decomposition and KDE instead of kernel density estimator. The difference between the two methods is shown in Figure 1. An advantage of the SRD is that it exploits almost all information available from the model, so the only numerical error occurs in the approximation of the spherical integral. The big disadvantage of the SRD is that it requires an analytical solution of the model, which we cannot always guarantee. Therefore we introduce a KDE, which estimates the probability density function of a random variable using a set of samples of that variable.

Definition 2. (kernel density estimator, see [19]) Let y be an n-dimensional real-valued random variable with an absolutely continuous distribution and a probability density function with respect to the Lebesgue measure. Moreover, let Y = {y_{S,1}, … , y_{S,N}} be an independent and identically distributed sample of y. Then, the kernel density estimator ρ̂_N : R^n → R_{≥0} is defined as

ρ̂_N(x) = (1 / (N det(H)^{1/2})) Σ_{i=1}^{N} K(H^{−1/2}(x − y_{S,i}))

with a symmetric positive definite bandwidth matrix H ∈ R^{n×n} and an n-variate density K : R^n → R_{≥0} called the kernel.
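A minimal numerical sketch of Definition 2, assuming the common convention K_H(u) = det(H)^{−1/2} K(H^{−1/2} u) with a standard Gaussian kernel (the paper's normalization of H may differ slightly); all names are illustrative.

```python
import numpy as np

def gaussian_kde(samples, H):
    """Multivariate KDE as in Definition 2, with a standard Gaussian kernel
    and the convention K_H(u) = det(H)^(-1/2) K(H^(-1/2) u)."""
    samples = np.asarray(samples, dtype=float)
    N, n = samples.shape
    # H^{-1/2} and det(H)^{1/2} via the eigendecomposition of the SPD matrix H
    w, V = np.linalg.eigh(np.asarray(H, dtype=float))
    H_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    det_H_sqrt = float(np.prod(w)) ** 0.5
    norm = (2.0 * np.pi) ** (n / 2.0)

    def rho_hat(x):
        u = (np.asarray(x, dtype=float) - samples) @ H_inv_sqrt  # rows H^{-1/2}(x - y_i)
        return float(np.mean(np.exp(-0.5 * np.sum(u ** 2, axis=1)) / norm) / det_H_sqrt)

    return rho_hat

# sanity check against the standard normal density in one dimension
rng = np.random.default_rng(0)
rho_hat = gaussian_kde(rng.standard_normal((20000, 1)), [[0.04]])
val = rho_hat([0.0])
```

With a finite bandwidth the estimator is slightly biased (it estimates the density of the sample convolved with the kernel), which is why the value at 0 comes out marginally below the true density 1/√(2π).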
We apply the kernel density estimation to the balance law solution. If an analytical solution of the model is not known, we can compute the solution numerically and use a set of approximated solutions as samples. Then the desired probability can be computed by integrating the kernel density estimator over the box constraints. This introduces an approximation error, but it allows us to work analytically with numerical solutions of our model. A KDE approach was used in [8] for solving chance constrained optimal control problems with ODE constraints. But it has been used neither in optimal control problems with PDE constraints and random boundary data nor in the context of continuous optimization with hyperbolic balance laws on networks. Throughout this paper, we illustrate the idea of the KDE in both the stationary and the dynamic case, such that we can state necessary optimality conditions for optimization problems with probabilistic constraints. This paper is structured as follows: In the next section, we consider stationary gas networks, similar to [17,27]. We first consider a simple model on a graph with only one edge to explain the ideas of the SRD and the KDE. We compare both results in a numerical computation with a classical Monte Carlo method (all numerical tests have been done with MATLAB®, version 2015a). Next we use both methods, the SRD and the KDE, to solve a model on a general tree-structured graph, and again we compare both methods with a classical Monte Carlo method. Finally, we state necessary optimality conditions for probabilistic constrained optimization problems related to our stationary model.
In Section 3, we consider a dynamic water network, in which the contaminant injection occurs at the boundaries. We consider a general linear hyperbolic balance law, which models the diffusion of harmful substances on a linear graph, in order to discuss probabilistic constraints in the time dependent case and whether the SRD can be extended to this case. We also use the KDE for this model. Finally, we state necessary optimality conditions for probabilistic constrained optimization problems related to our dynamic model.

Gas Networks in a Stationary Setting
In this section, we consider stationary states in gas networks. As mentioned before, the model here is similar to the model in [17,27]. The network is described by a connected, directed, tree-structured graph G = (V, E) (i.e. the graph does not contain cycles) with vertex set V = {v_0, … , v_n} and edge set E. We assume that the graph has only one inflow node v_0 and that the other nodes are outflow nodes. Let node v_0 be the root of the graph and let the graph be oriented away from the root. The graph is numbered from the root v_0 using breadth-first search or depth-first search. Every edge e ∈ E represents a pipe with positive length L^e. For x ∈ [0, L^e] we consider the stationary semi-linear isothermal Euler equations for horizontal pipes and ideal gases,

(q^e)_x(x) = 0,   ((p^e)^2)_x(x) = − (λ^e (c^e)^2 / D^e) q^e |q^e|.   (2)

By p^e we denote the restriction of the pressure defined on the network to the single edge e. Here, q^e is the flow along edge e and c^e, λ^e, D^e ∈ R_{>0} denote the sound speed, the friction coefficient and the pipe diameter. The parameters R_S and T denote the specific gas constant of natural gas and the (constant) temperature; for an ideal isothermal gas it holds (c^e)^2 = R_S T. Note that q^e is constant on every edge. With q^e ≥ 0 we denote that gas flows along the orientation of edge e and with q^e ≤ 0 we denote that gas flows against the orientation of edge e. We consider conservation of mass for the flow at the nodes (cf. Kirchhoff's first law). Let E^+(v) resp. E^−(v) be the set of all outgoing resp. ingoing edges at node v ∈ V. Let b_v ∈ R be the load at node v ∈ V. With b_v ≥ 0 we denote that gas leaves the network at node v (exit node) and with b_v ≤ 0 that gas enters the network at node v (entry node). The equation for mass conservation at every node v ∈ V is given by

Σ_{e ∈ E^−(v)} q^e − Σ_{e ∈ E^+(v)} q^e = b_v.   (3)

Let p_i denote the pressure at the node v_i for i = 0, … , n. We assume continuity of the pressure at every node, i.e. for all v_i ∈ V it holds

p^e(L^e) = p_i for every ingoing edge e at v_i, and p^e(0) = p_i for every outgoing edge e at v_i.   (4)

We consider (positive) box constraints p_i ∈ [p_i^min, p_i^max] for the pressures at all nodes. In addition, we impose the pressure p_0 at node v_0. So for the full graph, we consider the following model (6): the Euler equations (2) for all e ∈ E, the mass conservation (3) at all nodes, the continuity of pressure (4), the box constraints on the pressures, and the fixed pressure p_0 at the root. Our aim is now to find loads b such that system (6) has a solution. We do not consider the load b_{v_0} at the inflow node v_0, because equation (3) provides the relation b_{v_0} = − Σ_{i=1}^{n} b_{v_i}. We define

M := {b ∈ R^n_{≥0} : there exists a solution of (6)}

as the set of feasible loads. To go one step further, we assume that b is random. This is motivated by reality: because of the liberalization of the gas market (see [33]), the operation of the gas network is independent of the customers and the gas companies. That means a gas network company must guarantee that the required gas can be transported through the network, but the company does not know the exact amount of gas a priori, so it can be seen as random. We assume b ∼ N(µ, Σ) with mean value µ ∈ R^n_+ and positive definite covariance matrix Σ ∈ R^{n×n} on an appropriate probability space (Ω, A, P). Motivated by the application, µ and Σ are chosen s.t. the probability that b takes positive values is almost 1. In [17] and in [34], Chapter 13, the authors explain why a multivariate Gaussian distribution is a good choice for the random load vector. So we want to know the probability that for a Gaussian distributed load vector b, the model (6) has a solution, i.e.

P(b ∈ M ).
In the next subsections we use two different approaches to compute this probability. First, to explain the ideas of our approaches, we consider a graph with only one edge; later we use the introduced tree-structured graph with one entry.

Uncertain Load on a Single Edge
As mentioned before, we will give two different ways to compute the probability for a random load value to be feasible. Here, we consider a single edge as graph. For the boundary conditions p(0) = p_0 and q(L) = b, model (2) has an analytical solution, namely p(L) = (p_0^2 − φ L b |b|)^{1/2} with φ := λ c^2 / D. Therefore our problem simplifies to checking whether the pressure at the end of the edge satisfies the box constraints, i.e.

b ∈ M ⇔ b ≥ 0 and p_min ≤ p(L) ≤ p_max.

We can rewrite the box constraints for the pressure as inequalities, i.e.

(p_min)^2 ≤ p_0^2 − φ L b |b| ≤ (p_max)^2.   (8)

Now we can use the SRD (introduced in Section 1) in an algorithmic way. We consider b ∼ N(µ, σ^2) with mean value µ ∈ R_+ and standard deviation σ ∈ R_+. Because the unit sphere in this example is given by S^0 = {−1, 1}, we do not need to sample N points; we can use the unit sphere itself as sampling. For v ∈ S^0, we set

b_v(r) := µ + r σ v.

Note that in the multivariate Gaussian distribution, the covariance matrix Σ is given, which contains the variances of the random variables on its diagonal. In the one dimensional case, the standard deviation is given, which is a Cholesky factor of the variance.
To guarantee that b_v(r) is positive, we define the regular range

R_{v,reg} := {r ≥ 0 : b_v(r) ≥ 0}.

We insert b_v(r) into the inequalities in (8). Since b_v(r) ≥ 0 on R_{v,reg}, we can write b_v^2 instead of b_v |b_v|. Thus we have quadratic inequalities in r, and we can write the set M_v := {r ∈ R_{v,reg} : b_v(r) ∈ M} as a union of s ∈ N disjoint intervals.
Since the unit sphere contains only two values, the probability for a random load to be feasible is given by

P(b ∈ M) = (1/2) Σ_{v ∈ {−1,1}} Σ_{i=1}^{s} [F_χ(ā_{v,i}) − F_χ(a_{v,i})],   (9)

where F_χ(•) is the cumulative distribution function of the χ-distribution and [a_{v,i}, ā_{v,i}] (i = 1, … , s) are the disjoint intervals forming M_v. As mentioned before, the SRD gives us an efficient way to compute this probability, but we need to know the analytical solution of our system (6).
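On a single edge the two-point sphere makes the SRD exact, which can be checked against the closed-form Gaussian probability. The following sketch evaluates formula (9) with the one-degree-of-freedom χ-cdf F(r) = erf(r/√2) for the illustrative feasible set M = [0, c] (a hypothetical upper load bound c stands in for the quadratic pressure inequalities of the gas model):

```python
import math

def phi(x):
    # standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def srd_single_edge(mu, sigma, c):
    """SRD on the sphere S^0 = {-1, +1} for b ~ N(mu, sigma^2) and the
    illustrative feasible set M = [0, c]; the chi-cdf with one degree of
    freedom is F(r) = erf(r / sqrt(2))."""
    F = lambda r: math.erf(max(r, 0.0) / math.sqrt(2.0))
    total = 0.0
    for v in (-1.0, 1.0):
        # {r >= 0 : 0 <= mu + r*sigma*v <= c} is a single interval [lo, hi]
        if v > 0:
            lo, hi = max(0.0, -mu / sigma), (c - mu) / sigma
        else:
            lo, hi = max(0.0, (mu - c) / sigma), mu / sigma
        if hi > lo:
            total += F(hi) - F(lo)
    return 0.5 * total  # uniform average over the two directions

p = srd_single_edge(mu=2.0, sigma=1.0, c=3.5)
p_exact = phi((3.5 - 2.0) / 1.0) - phi((0.0 - 2.0) / 1.0)
```

Because the "sphere" is finite, the SRD value agrees with the exact Gaussian interval probability up to floating-point rounding.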
Another way to compute the probability for a random load vector to be feasible is the kernel density estimator. It is more general and does not require the analytical solution of our model. We consider the stochastic equation corresponding to (2) with random boundary condition b and coupling conditions (3), (4), which also has a solution P-almost surely. Hence the pressure at node v_1 is a random variable, which we denote by p_1. The probability that a random load is feasible is equal to the probability that the pressure p_1 is in the interval [p_min, p_max]. We assume that the variance of p_1 is positive and that the distribution of the pressure p_1 is absolutely continuous with probability density function ρ_p. Now, the probability P(p_1 ∈ [p_min, p_max]) can be computed by integrating the probability density function ρ_p over the pressure bounds, so we get

P(b ∈ M) = P(p_1 ∈ [p_min, p_max]) = ∫_{p_min}^{p_max} ρ_p(z) dz.

If the exact probability density function ρ_p is not known, we can approximate the function by a kernel density estimator. Then we integrate this estimator over the interval [p_min, p_max] to get an approximation of the probability P(b ∈ M).
Let B = {b_{S,1}, … , b_{S,N}} be an independent and identically distributed sample of the random load b and let P_B = {p_1(b_{S,1}), … , p_1(b_{S,N})} ⊆ R be the pressures at the end of the edge for the different loads b_{S,i} ∈ B (i = 1, … , N), which are also independent and identically distributed. We use the KDE in one dimension with bandwidth h ∈ R_+ and with a Gaussian kernel (see e.g. [13,40]) to get an approximation of the density function ρ_p:

ρ̂_{p,N}(z) = (1 / (N h)) Σ_{i=1}^{N} K((z − p_1(b_{S,i})) / h).   (11)

Remark 3. The choice of the bandwidth h is a topic of its own. We refer to [13], Chapter 8 and [31], Chapter 3 for studies about optimal bandwidths for KDEs. Here we use a heuristic formula of Silverman type for the bandwidth,

h = σ_N (4 / (3N))^{1/5},   (12)

where σ_N is the standard deviation of the sampling and N is the number of samples. The idea is that the bandwidth is computed depending on the variance of the sampling. This is stated and explained e.g. in [19], Chapter 4.2 and in [42].
For the previous bandwidth it holds (h + (N h)^{−1}) → 0 P-almost surely for N → ∞. Therefore, [11] (Chapter 6, Theorem 1) provides the L^1-convergence of the estimator:

∫_R |ρ̂_{p,N}(z) − ρ_p(z)| dz → 0 P-almost surely for N → ∞.

Thus, for an appropriate choice of the bandwidth h, we can use the KDE as an approximation for the exact probability density function of the pressure. From Scheffé's lemma (see [11]), it follows that the corresponding probabilities converge as well. So with (11) we can approximate the probability for a random load vector to be feasible as follows:

P(b ∈ M) ≈ ∫_{p_min}^{p_max} ρ̂_{p,N}(z) dz.

In this example, we can compute the sampling set P_B analytically, because the analytical solution of model (2) is known. If this is not the case, e.g. for more complex systems like the stationary full Euler equations or Navier-Stokes equations (see [12]), one can use numerical methods to solve the PDE and to get an approximated sampling set P_B. Example 1: We compare the SRD and the KDE with a classical Monte Carlo (MC) method. For both the MC method and the KDE approach, we use the same sampling of 5 · 10^4 points. The bandwidth for the KDE is given by (12). The result for 8 tests is shown in Table 2. The sphere of the SRD in this example is finite (S^0 = {−1, 1}), so the SRD always gives the exact probability (up to numerical errors due to quadrature) of 82.75%. Also, the probabilities computed by MC and the KDE are quite similar. The mean probability with MC resp. the KDE is 82.87% resp. 82.40% and the variances are 0.0238 resp. 0.0218, which is very close to the (exact) result of the SRD. Thus, one can see that the KDE provides a suitable method to compute the desired probability, and it would get even better using the optimal bandwidth (see Remark 3).
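The one-dimensional KDE probability, i.e. the integral of (11) over the pressure interval, has a closed form in terms of the error function, which is what makes the comparison with Monte Carlo cheap. A sketch with synthetic standard normal "pressures" (the Silverman-type bandwidth constant is an assumption, cf. Remark 3):

```python
import math, random

def kde_interval_probability(samples, lo, hi):
    """Integral of the 1-D Gaussian-kernel KDE (11) over [lo, hi],
    evaluated in closed form via the error function.

    Bandwidth: Silverman-type rule h = sigma_N * (4 / (3N))^(1/5)
    (the constant is an assumption, cf. Remark 3)."""
    N = len(samples)
    mean = sum(samples) / N
    sigma = math.sqrt(sum((s - mean) ** 2 for s in samples) / (N - 1))
    h = sigma * (4.0 / (3.0 * N)) ** 0.2
    acc = 0.0
    for s in samples:
        # integral of one Gaussian bump over [lo, hi]
        acc += 0.5 * (math.erf((hi - s) / (math.sqrt(2.0) * h))
                      - math.erf((lo - s) / (math.sqrt(2.0) * h)))
    return acc / N

random.seed(1)
pressures = [random.gauss(0.0, 1.0) for _ in range(20000)]  # synthetic "pressures"
p_kde = kde_interval_probability(pressures, -1.0, 1.0)
p_mc = sum(1 for s in pressures if -1.0 <= s <= 1.0) / len(pressures)
```

For standard normal data the exact interval probability is P(|X| ≤ 1) ≈ 0.6827; the KDE estimate is slightly smoothed but agrees with the Monte Carlo frequency on the same samples.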

Uncertain Loads on Tree-structured Graphs
For this subsection, we consider a tree-structured graph (i.e. the graph does not contain cycles) G = (V, E) with one entry v_0, as introduced in the beginning of Section 2. Let the graph be numbered from the root v_0 by breadth-first search or depth-first search and let model (6) hold on every edge.
First we rewrite the system (6) using the solution of the isothermal Euler equations. We use the incidence matrix A^+ ∈ R^{(n+1)×n} of the graph. For an edge e ∈ E, which connects the nodes v_i and v_j starting from node v_i, the column of A^+ corresponding to e has the entry −1 in row i, the entry +1 in row j, and zeros elsewhere. A formal definition of the incidence matrix can be found e.g. in [17,25,27]. We set A ∈ R^{n×n} as A^+ without the first row (which corresponds to the root resp. the only entry of the graph) and we define the function

g(b) := A^{−T} Φ (A^{−1} b) |A^{−1} b|,

where Φ ∈ R^{n×n} is a diagonal matrix with the values φ^e = λ^e (c^e)^2 L^e / D^e (e ∈ E) on its diagonal. The product on the right has to be understood component-by-component. The i-th component of this function states the pressure loss from the root v_0 to node v_i. With this function we get the following equivalence for feasible loads:

Lemma 4. There exists a solution of (6) iff the following system of inequalities holds for all k = 1, … , n:

b_k ≥ 0,   (p_k^min)^2 ≤ p_0^2 − g_k(b) ≤ (p_k^max)^2.

Proof. The result follows from [17], Corollary 1; since the pressure p_0 at the root is fixed, the third inequality there is redundant and Lemma 4 is proven.
Theorem 5. The set M of feasible loads is convex.

Proof. The proof is equal to the proof of Theorem 11 in [25] for linear graphs, i.e. graphs without junctions. Here, the proof also works for tree-structured networks, because the pressure at the inflow node is fixed. Thus, the inequality at which convexity for tree-structured networks breaks in [25] is redundant here (see Lemma 4).

Now we use the SRD to compute the probability for a random load vector to be feasible. For this setting, the SRD approach is explained in detail in [17,27]. Let b ∼ N(µ, Σ) with mean value µ ∈ R^n_+ and positive definite covariance matrix Σ ∈ R^{n×n} on an appropriate probability space (Ω, A, P). The next steps are similar to the case of one edge. For a point v of a sampling of the unit sphere S^{n−1} we set

b_v(r) := µ + r L v,

where L is s.t. L L^T = Σ. Because we only consider outflows, b_v(r) must be positive. We define the regular range

R_{v,reg} := {r ≥ 0 : b_v(r) ≥ 0}.

The inequalities of Lemma 4 are quadratic in the variable r and we can write the sets M_v as unions of disjoint intervals. Thus, (9) gives us the probability for a random load vector to be feasible. As in the previous subsection, we consider the stochastic equation corresponding to (2) with random load b and coupling conditions (3), (4). The n-dimensional random vector p denotes the pressure at the nodes v_1, … , v_n. The probability that b is feasible is equal to the probability that the pressure p at the nodes is within the pressure bounds, which we denote by P(p ∈ P_min^max) with

P_min^max := [p_1^min, p_1^max] × … × [p_n^min, p_n^max].

We assume that the covariance matrix of the random vector p is positive definite and that its distribution is absolutely continuous with probability density function ρ_p. Now, it holds

P(b ∈ M) = P(p ∈ P_min^max) = ∫_{P_min^max} ρ_p(z) dz.   (16)

However, the exact probability density function ρ_p is not known, but we can approximate it using a multidimensional kernel density estimation. Let B = {b_{S,1}, … , b_{S,N}} be independent and identically distributed nonnegative samples of the random load vector b ∼ N(µ, Σ), and let P_B := {p(b_{S,1}), … , p(b_{S,N})}. These samples are also independent and identically distributed. Note that p_j(b_{S,i}) is the pressure at node v_j and the j-th component of the pressure vector p(b_{S,i}). We introduce the general n-dimensional multivariate kernel density estimator (see e.g. [19]):

ρ̂_{p,N}(x) = (1 / (N det(H)^{1/2})) Σ_{i=1}^{N} K(H^{−1/2}(x − p(b_{S,i})))   (18)

with a symmetric positive definite bandwidth matrix H ∈ R^{n×n} and an n-variate density function K : R^n → R_{≥0} called kernel. We choose the standard multivariate normal density function as kernel. This kernel can be written as the product K(x) = Π_{i=1}^{n} K(x_i), where the univariate kernel K : R → R is the standard univariate normal density function. Let σ_{N,i} denote the positive sample standard deviation of the i-th variable. Moreover, let V_N denote the diagonal matrix

V_N := diag(σ_{N,1}, … , σ_{N,n}).

As suggested in [19], we use a diagonal bandwidth matrix of the form H = h^2 V_N^2 with a scalar smoothing parameter h depending on N. This choice simplifies the estimator to the product form

ρ̂_{p,N}(x) = (1/N) Σ_{i=1}^{N} Π_{j=1}^{n} (1/h_j) K((x_j − p_j(b_{S,i})) / h_j)

with the per-coordinate bandwidths h_j := h σ_{N,j}. As mentioned in [39,43], such product kernels are recommended and adequate in practice. But in some situations using only diagonal bandwidth matrices can be insufficient and general full bandwidth matrices are needed, e.g. depending on the sample covariance matrix.
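The product form of the estimator can be sketched directly; the per-coordinate rule h_j = σ_{N,j} N^{−1/(n+4)} used below is a Scott-type choice and only one possible reading of the rule suggested in [19]:

```python
import math, random

def product_kde(samples):
    """Product-Gaussian KDE with diagonal bandwidth matrix; the
    per-coordinate rule h_j = sigma_{N,j} * N^(-1/(n+4)) is a Scott-type
    choice (one possible reading of the rule in [19])."""
    N, n = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / N for j in range(n)]
    sigmas = [math.sqrt(sum((s[j] - means[j]) ** 2 for s in samples) / (N - 1))
              for j in range(n)]
    h = [sigmas[j] * N ** (-1.0 / (n + 4)) for j in range(n)]

    def rho_hat(x):
        total = 0.0
        for s in samples:
            prod = 1.0
            for j in range(n):
                u = (x[j] - s[j]) / h[j]
                prod *= math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * h[j])
            total += prod
        return total / N

    return rho_hat

random.seed(7)
samples = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(10000)]
val = product_kde(samples)([0.0, 0.0])  # true N(0, I_2) density at 0 is 1/(2*pi)
```

The product structure is exactly what later allows the box integral over P_min^max to factor into one-dimensional error-function terms.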
In order to show the convergence of the multivariate KDE (18), we transform the random vector p with the matrix V_N, i.e. we consider y := V_N^{−1} p. Thus we get the transformed sampling set

Y_B := {V_N^{−1} p(b_{S,1}), … , V_N^{−1} p(b_{S,N})}.

For the approximation of the probability density function of the transformed variable we use the multivariate KDE with the previous settings adapted to the data Y_B. Thus we get the bandwidth matrix H = h_y^2 I_{n×n}, because the transformed data have unit variance. This leads to a KDE ρ̂_{y,N} with one single bandwidth h_y. Due to (h_y + (N h_y^n)^{−1}) → 0 P-almost surely for N → ∞, we can apply Chapter 6, Theorem 1 in [11] to the estimator ρ̂_{y,N}. Thus, it holds

∫_{R^n} |ρ̂_{y,N}(y) − ρ_y(y)| dy → 0 P-almost surely for N → ∞,

where ρ_y is the exact probability density function of the random variable y. This density function is given by ρ_y(y) = det(V_N) ρ_p(V_N y) according to the transformation, and there is a similar relation between the estimators. Using these relations, the L^1-convergence for the KDE ρ̂_{p,N} follows. Applying Scheffé's lemma (see [11]) yields the convergence of the corresponding probabilities. Thus, the integral of the estimator over the pressure bounds converges P-almost surely to the probability P(p ∈ P_min^max). Now, we can use this integral as an approximation for the probability P(b ∈ M) in (16):

P(b ∈ M) ≈ ∫_{P_min^max} ρ̂_{p,N}(z) dz.

This multidimensional integral has some useful properties, which we will need when we derive the necessary optimality conditions. We want to mention again that we can compute the sampling set P_B analytically, because we know the solution of the stationary isothermal Euler equations. Example 2: This short example illustrates the results of both approaches applied to the tree-structured graph with three nodes and two edges shown in Figure 2. The values (without units) are given in Table 3.
From the inequalities in Lemma 4 we obtain the feasible set M. As in the last example, we compare both methods, the SRD and the KDE, with a classical Monte Carlo (MC) method. For the MC method and the KDE approach, we use the same sampling. The sampling of the SRD consists of 1 · 10^4 points uniformly distributed on the sphere S^1, so the SRD always gives the same (good) result of 74.95%. Again, the MC method and the KDE approach are quite close. The mean probability with MC resp. the KDE is 75.10% resp. 74.79% and the variance is 0.0235 resp. 0.0215. Thus, also in the two dimensional case, the KDE approach is well suited for computing the desired probability. The computing time in every test is quite reasonable: the MC method and the KDE approach need less than one second, while the SRD needs almost two seconds. However, the focus of the implementation was on correctness, not on efficiency, so the computing times can certainly be improved.

Stochastic Optimization on Stationary Gas Networks
In this subsection, we formulate necessary optimality conditions, which are also sufficient in the convex case, for optimization problems with probabilistic constraints. Both MC and SRD give algorithmic ways to compute the probability for a random load vector to be feasible. With a KDE approach, which provides a sufficiently good approximation of the probability for sufficiently large N, we can derive necessary optimality conditions for certain optimization problems with probabilistic constraints. Let a continuously differentiable objective function f be given. For a probability level α ∈ (0, 1), we consider the optimization problems

min f(p^max) s.t. P(b ∈ M) ≥ α   (23)

with the upper pressure bounds p^max as optimization variable, and

min f(p_0) s.t. P(b ∈ M) ≥ α   (24)

with the input pressure p_0 as optimization variable. Normally, α is chosen large, s.t. α is almost 1. As mentioned before, our aim here is to write down the necessary optimality conditions in an appropriate way. In Section 2.1 we have shown that the KDE provides good results for computing probabilistic constraints.
In fact, we formulate the optimality conditions for the approximated optimization problems

min f(p^max) s.t. ∫_{P_min^max} ρ̂_{p,N}(z) dz ≥ α   (25)

and

min f(p_0) s.t. ∫_{P_min^max} ρ̂_{p,N}(z) dz ≥ α.   (26)

Due to the convergence results stated before, the approximated probabilistic constraint converges to the exact probabilistic constraint for N → ∞, and thus also the optimal solution of (25) resp. (26) converges to the optimal solution of (23) resp. (24). This allows us to use necessary optimality conditions for the approximated problems (25) and (26). Our aim is now to integrate the kernel density estimator of the pressure at the nodes v_1, … , v_n over the pressure bounds. Since P_min^max is an n-dimensional cuboid and ρ̂_{p,N}(z) is continuous, we can use Fubini's theorem, and since the density estimator of the pressure is a product of an exponential function in every dimension, we can exchange the integral and the product. It follows

∫_{P_min^max} ρ̂_{p,N}(z) dz = (1/N) Σ_{i=1}^{N} Π_{j=1}^{n} ∫_{p_j^min}^{p_j^max} (1/h_j) K((z_j − p_j(b_{S,i})) / h_j) dz_j.

We define

ϕ_{i,j}(x) := (x − p_j(b_{S,i})) / (√2 h_j),

we set τ_{i,j} := ϕ_{i,j}(z_j) and use integration by substitution. Then, with ϕ'_{i,j}(x) = (√2 h_j)^{−1}, the inner integrals can be expressed through the Gauss error function (see e.g. [3]),

erf(x) = (2/√π) ∫_0^x exp(−t^2) dt.   (27)

We insert the Gauss error function in the previous integral term and obtain

∫_{P_min^max} ρ̂_{p,N}(z) dz = (1/N) Σ_{i=1}^{N} Π_{j=1}^{n} (1/2) [erf(ϕ_{i,j}(p_j^max)) − erf(ϕ_{i,j}(p_j^min))].

Now we consider the problem (25). For α ∈ (0, 1) we define

g_α(p^max) := α − ∫_{P_min^max} ρ̂_{p,N}(z) dz.

Thus the constraint of (25) can be written as g_α(p^max) ≤ 0. We compute the partial derivatives of g_α. For k ∈ {1, … , n} we have, with the Gauss error function (27),

∂g_α/∂p_k^max (p^max) = − (1/N) Σ_{i=1}^{N} (1/(√(2π) h_k)) exp(−ϕ_{i,k}(p_k^max)^2) Π_{j≠k} (1/2) [erf(ϕ_{i,j}(p_j^max)) − erf(ϕ_{i,j}(p_j^min))].   (29)

Then, the k-th component of the gradient ∇g_α(p^max) ∈ R^n is given by (29). Note that for p^min < p^max the partial derivatives in (29) are negative for all p^max ∈ R^n.

Remark 6. Let p̄^max ∈ R^n with g_α(p̄^max) = 0 be given. Then ∇g_α(p̄^max)^T 1_n < 0 and thus the Mangasarian-Fromovitz constraint qualification (MFCQ) holds at p̄^max.
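The box integral and the gradient formula (29) translate directly into code. The sketch below computes the probability term ∫_{P_min^max} ρ̂_{p,N}(z) dz and its partial derivatives with respect to the upper pressure bounds (the derivatives of g_α are the negatives of these); sample pressures, bandwidths and bounds are hypothetical inputs, and the gradient is checked against a finite difference.

```python
import math

SQRT2 = math.sqrt(2.0)

def box_probability(P, h, pmin, pmax):
    """Closed-form integral of the product-Gaussian KDE over the cuboid
    [pmin_1, pmax_1] x ... x [pmin_n, pmax_n] via the error function.
    P: sample pressure vectors p(b_{S,i}); h: per-coordinate bandwidths."""
    n = len(pmin)
    total = 0.0
    for sample in P:
        prod = 1.0
        for j in range(n):
            prod *= 0.5 * (math.erf((pmax[j] - sample[j]) / (SQRT2 * h[j]))
                           - math.erf((pmin[j] - sample[j]) / (SQRT2 * h[j])))
        total += prod
    return total / len(P)

def box_probability_grad(P, h, pmin, pmax):
    """Partial derivatives w.r.t. the upper bounds pmax_k; the partial
    derivatives of g_alpha in (29) are the negatives of these values."""
    n = len(pmin)
    grad = [0.0] * n
    for sample in P:
        terms = [0.5 * (math.erf((pmax[j] - sample[j]) / (SQRT2 * h[j]))
                        - math.erf((pmin[j] - sample[j]) / (SQRT2 * h[j])))
                 for j in range(n)]
        for k in range(n):
            gauss = (math.exp(-0.5 * ((pmax[k] - sample[k]) / h[k]) ** 2)
                     / (math.sqrt(2.0 * math.pi) * h[k]))
            prod = gauss
            for j in range(n):
                if j != k:
                    prod *= terms[j]
            grad[k] += prod
    return [g / len(P) for g in grad]

# hypothetical sample pressures, bandwidths and pressure bounds
P = [[0.0, 0.2], [0.5, -0.3], [-0.4, 0.1]]
h = [0.4, 0.5]
pmin, pmax = [-1.0, -1.0], [1.0, 1.0]
g_val = box_probability(P, h, pmin, pmax)
grad = box_probability_grad(P, h, pmin, pmax)
```

Since the gradient of the box probability is strictly positive in every component, the corresponding derivatives of g_α are strictly negative, which is exactly the MFCQ argument of Remark 6.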

Now we can state necessary optimality conditions for the optimization problem (25):
Corollary 7. Let p^{*,max} ∈ R^n be a (local) optimal solution of (25). Since the MFCQ holds at p^{*,max}, there exists a multiplier µ^* ≥ 0, s.t.

∇f(p^{*,max}) + µ^* ∇g_α(p^{*,max}) = 0 and µ^* g_α(p^{*,max}) = 0.

Thus, (p^{*,max}, µ^*) ∈ R^{n+1} is a Karush-Kuhn-Tucker point. Now we consider problem (26). We slightly change the notation to add the explicit dependence on p_0, so we write p(b_{S,i}, p_0) (i = 1, … , N) instead of p(b_{S,i}) for the samples in the set P_B. We redefine the function ϕ_{i,j} accordingly as ϕ_{i,j}(x, p_0) := (x − p_j(b_{S,i}, p_0)) / (√2 h_j) and we define the constraint of (26) as

g_α(p_0) := α − ∫_{P_min^max} ρ̂_{p,N}(z) dz ≤ 0.

For the derivative with respect to p_0, we can exchange the integral and the derivative. We set τ_{i,j} := ϕ_{i,j}(z_j, p_0) and, since p_k(b_{S,i}, p_0) is independent of z_k, the remaining integrals can again be solved analytically via the Gauss error function. Note that in the setting of the stationary gas networks, the pressures are given explicitly by p_k(b_{S,i}, p_0) = (p_0^2 − g_k(b_{S,i}))^{1/2}, so the derivatives with respect to p_0 are known analytically.

Remark 8. The term ∂p_k/∂p_0 appearing in this derivative is not necessarily positive, so the derivative with respect to p_0 is not necessarily negative. Since (26) has only one constraint, the linear independence constraint qualification (LICQ) holds.

Corollary 9. Let p_0^* ∈ R be a (local) optimal solution of (26). Since the LICQ holds at p_0^*, there exists a multiplier µ^* ≥ 0, s.t.

f'(p_0^*) + µ^* g_α'(p_0^*) = 0 and µ^* g_α(p_0^*) = 0.
If the objective function f is strictly convex and the feasible set is convex, then all necessary conditions stated here are sufficient.In this case, Corollary 7 and Corollary 9 give a characterization of the (unique) optimal solution.

Dynamic Flow Networks
In this section, we extend the methods introduced in Section 2 to dynamic systems. We first discuss probabilistic constraints in a dynamic setting with time dependent random boundary data. Then we consider a model which we can solve analytically, in order to use the idea of the SRD for dynamic systems and to compare it with the idea of the KDE. Finally, we formulate necessary optimality conditions for optimization problems with probabilistic constraints in a dynamic setting.

Time Dependent Probabilistic Constraints and Random Boundary Data
In Section 2, we computed the probability for a random vector to be feasible. If we fix a point in time t^* ∈ [0, T], we can proceed similarly. But if we do not fix a point in time, we need to specify how the probabilistic constraint is to be understood over a whole time period. For a time dependent uncertain boundary function b(t) and a (time dependent) feasible set M(t), a possible formulation (the one that we will use later) for the probabilistic constraint is

P(b(t) ∈ M(t) for all t ∈ [0, T]) ≥ α.   (30)

That means we want to guarantee that a percentage α of all possible random boundary functions (on an appropriate probability space (Ω, A, P)) is feasible at every point in time t ∈ [0, T]. This is a very strong condition. Another possibility is

P(b(t) ∈ M(t)) ≥ α for all t ∈ [0, T],

which means that a random boundary function must be feasible with probability α at every time point t ∈ [0, T]. For our applications, this might not make sense, because we want to guarantee that problems for a customer only occur in worst case scenarios. This probabilistic constraint only states that even in these worst case scenarios the problems for a customer stay small, but these small problems can occur at every point in time. A third possibility for the time dependent probabilistic constraint is an average formulation:

(1/T) ∫_0^T P(b(t) ∈ M(t)) dt ≥ α.

That means the average probability over the time period [0, T] must be large enough. This formulation might make sense in other applications, but not for the flow problems considered here (for the same reason as before). Thus we use the formulation (30) for time dependent probabilistic constraints.
Next we discuss the uncertain boundary data. For random boundary data, we use a representation as a Fourier series, as it is done in [15]. For a deterministic boundary function b_D : [0, T] → R with b_D(0) = 0 and for m = 0, 1, 2, … , we define an orthonormal series (o_m)_{m≥0} in L^2([0, T]) and the coefficients

c_m := ∫_0^T b_D(s) o_m(s) ds.

Then we can write the boundary function b_D(t) in the series representation

b_D(t) = Σ_{m=0}^∞ c_m o_m(t).   (33)

Now, for m ∈ N_0, we consider Gaussian distributed random variables a_m ∼ N(µ, σ^2) with mean value µ ∈ R and standard deviation σ ∈ R_+ on an appropriate probability space (Ω, A, P), and we consider the random boundary data

b(t) := Σ_{m=0}^∞ a_m c_m o_m(t).   (34)

Since the random variables a_m are all independent and identically distributed, one can use the fact that for b_D ∈ L^2([0, T]) it also holds b ∈ L^2([0, T]) P-almost surely (see [15,32]).
Remark 10. For the numerical tests, we truncate the Fourier series after N_F ∈ N terms. Thus we use

b_D^{N_F}(t) := Σ_{m=0}^{N_F} c_m o_m(t)

instead of (33) for the implementation of b_D. Because lim_{N_F→∞} ||b_D − b_D^{N_F}||_{L^2([0,T])} = 0, this truncated Fourier series is a sufficiently good approximation of b_D(t) for N_F large enough. Similarly, we use a truncation of (34) as random boundary data for the implementation.
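A sketch of the truncated series from Remark 10, with an assumed orthonormal sine basis o_m(t) = √(2/T) sin(mπt/T) (which satisfies o_m(0) = 0, so every realization fulfills b(0) = 0) and the multiplicative randomization a_m ∼ N(1, σ^2), i.e. mean value µ = 1. Both choices are illustrative, not necessarily the ones in [15].

```python
import math, random

T = 1.0
N_F = 50

def o(m, t):
    # assumed orthonormal sine basis on [0, T]; o(m, 0) = 0 for every m,
    # and the m = 0 term vanishes identically, so summation starts at m = 1
    return math.sqrt(2.0 / T) * math.sin(m * math.pi * t / T)

def coeff(b_D, m, n_quad=2000):
    # c_m = <b_D, o_m> in L^2([0, T]) via the composite trapezoidal rule
    hs = T / n_quad
    s = 0.5 * (b_D(0.0) * o(m, 0.0) + b_D(T) * o(m, T))
    for k in range(1, n_quad):
        t = k * hs
        s += b_D(t) * o(m, t)
    return s * hs

b_D = lambda t: t * (T - t)                      # deterministic profile with b_D(0) = 0
c = [coeff(b_D, m) for m in range(1, N_F + 1)]   # Fourier coefficients c_1 .. c_{N_F}

def sample_boundary(rng, sigma=0.1):
    # one realization of the truncated random series: a_m ~ N(1, sigma^2)
    a = [rng.gauss(1.0, sigma) for _ in range(N_F)]
    return lambda t: sum(a[m - 1] * c[m - 1] * o(m, t) for m in range(1, N_F + 1))

# deterministic reconstruction (a_m = 1) evaluated at the midpoint
det_val = sum(c[m - 1] * o(m, 0.5) for m in range(1, N_F + 1))
```

For this smooth profile the truncated series already reproduces b_D very accurately at interior points, which illustrates why a moderate N_F suffices in the implementation.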

Deterministic Loads for a Scalar PDE
For (t, x) ∈ [0, T] × [0, L] and constants d < 0, m ≤ 0, we consider the deterministic scalar linear PDE

r_t(t, x) + d r_x(t, x) = m r(t, x)   (35)

with initial condition r(0, x) = r_0(x) and boundary condition r(t, L) = b(t). Here, r is the concentration of the contamination. The term d r_x describes the transport of the contamination with the water flow and the term m r describes the decay of the contamination. This equation models, e.g., the flow of contamination in water along a pipe or in a network (see [16,20]). We assume C^0-compatibility between the initial and the boundary condition. We will specify the boundary condition later. By b(t) ≥ 0 we denote that the water gets polluted and by b(t) < 0 that the water gets cleaned.
For initial data r_0 ∈ L^2([0, L]) and boundary data b ∈ L^2([0, T]), a solution of (35) lies in C([0, T], L^2((0, L); R)) and is analytically given by the method of characteristics,

r(t, x) = exp(m t) r_0(x − d t) if x − d t ≤ L,
r(t, x) = exp(−m (L − x)/d) b(t + (L − x)/d) if x − d t > L.

We now extend this model to a network, described by a linear graph G = (V, E). Every edge e_i ∈ E has a positive length L_i. Linear means here that every node has at most one outgoing edge (see Figure 3). For a formal definition see [25]. The equation (35) holds on every edge. We assume conservation of the flow at the nodes, i.e. the concentration entering an edge at a junction is given by the concentration leaving the neighboring edge plus the contaminant injection at that node,
where r_i denotes the contamination concentration on edge e_i and b_i denotes the boundary data at node v_i. For constants d_k < 0, m_k ≤ 0, the full model (36) consists of the PDE (35) on every edge together with the initial, boundary and coupling conditions. The model can be interpreted as follows: The graph represents a water network, where the water is contaminated at the nodes v_i (i = 1, … , n). This contamination is transported through the graph in negative direction (due to d_k < 0). We want to know the contamination rate at node v_0. Later, we assume the contamination rate at the nodes to be Gaussian distributed. Then, for a time t^* ∈ [0, T], we want to compute the probability for the contamination rate at node v_0 to fulfill box constraints using both the SRD and the KDE. In both cases, we also consider the general time dependent chance constraints discussed before. The next theorem states an analytical solution for the model (36).
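On a single pipe, the characteristics formula can be implemented and checked pointwise. The sketch below assumes the solution form r(t, x) = exp(mt) r_0(x − dt) in the initial-data region and the corresponding boundary-data formula otherwise (inflow boundary at x = L, since d < 0); this is our reading of the single-edge formula, not a verbatim transcription of the network solution (36).

```python
import math

def transport_solution(t, x, d, m, L, r0, b):
    """Solution of r_t + d*r_x = m*r on [0, L] with d < 0, m <= 0,
    r(0, x) = r0(x) and inflow boundary r(t, L) = b(t), obtained by
    tracing the characteristic through (t, x) backwards."""
    x0 = x - d * t                       # foot of the characteristic at time 0
    if x0 <= L:
        return math.exp(m * t) * r0(x0)  # value determined by the initial data
    s = t + (L - x) / d                  # entry time through the boundary x = L
    return math.exp(m * (t - s)) * b(s)  # value determined by the boundary data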
We want to mention that if the initial and boundary conditions (including the coupling conditions) satisfy stronger compatibility conditions, the solution has better regularity properties. Remark 13. In the analytical solution of (36) we only distinguish whether the solution depends on the initial or on the boundary condition, depending on the time and the location in the pipe. For the solution on edge 4 in Figure 4, we distinguish the following cases: at the beginning of edge 4 (for x = 0), for 0 ≤ t ≤ −L_4/d_4, the solution on edge 4 depends on the initial condition of edge 4. Then it depends on the boundary condition of edge 3 and the initial condition of edge 4 (due to the coupling condition), and finally on the boundary conditions of edge 3 and edge 4. This leads to the case distinction in Theorem 11. Remark 14. With the result of Theorem 11 one can also derive analytical solutions of (36) for tree-structured graphs, but one has to take into account that the flow at the end of an edge (due to the coupling conditions) can depend on more than one edge. That means the solution on a tree-structured graph is basically the sum over all paths of the solution stated in Theorem 11.
Proof of Theorem 11. We define the following functions. We consider the k-th edge in a linear graph with n edges (k ∈ {1, …, n}), so the PDE in system (36) holds in the marked area in Figure 5(a), and it also holds in the marked area in Figure 5(b). Since sums from k to k − 1 are empty, it follows that γ_{k,k}(0, x) = 0 and δ_{k,k}(0, x) = x. Thus the initial conditions are fulfilled (see Figure 6(a)).
Figure 5: Areas in which the PDE of system (36) holds
For checking the boundary condition, we consider the case k = n and x ≥ d_n t + L_n, and obtain the boundary values. Finally, we have to check the coupling conditions; in the first case they are fulfilled (see Figure 7(a)).
So the coupling conditions also hold in this case (see Figure 7(b)), and the theorem is proven.
Figure 7: Areas in which the coupling conditions of (36) hold
In the next section, we consider model (36) with uncertain boundary data.

Stochastic Loads for a scalar PDE
The model (36) describes the distribution of the water contamination in a network which is contaminated by the customers at all nodes except v_0. At the end of the network (at node v_0), there are restrictions on the contamination rate. But because the contamination rate at the nodes cannot be known a priori, it can be seen as random: one can expect a certain value from statistics or measurements, but this value is never exact. Therefore, we use the random boundary data in the Fourier series representation introduced before. For m ∈ N_0, we consider the Gaussian distributed random variables a_m ∼ N(µ, Σ) with mean value µ ∈ R^n_+ and positive definite covariance matrix Σ ∈ R^{n×n} on an appropriate probability space (Ω, A, P). This defines the random boundary data at each node, with the coefficients and ψ_m defined in (31). Note again that for the implementation, we truncate the series after N_F terms. So the full model in this subsection is given in (36). For this model, we define the set of feasible loads. Our aim in this subsection is, for a time t* ∈ [0, T], to compute the probability that for a random boundary function b ∈ L^2([0, T]; R^n_{≥0}), the solution of the linear system (36) satisfies the box constraints (37) at the point in time t*. From Theorem 11 we know the solution explicitly, where b_ω denotes the realization b(ω) of the random boundary data for ω ∈ Ω. For i = 1, …, n, we define certain time-dependent values. Then b is feasible at time t* ∈ [0, T] iff inequality (39) holds for t* ≥ −Σ_{j=1}^{n} L_j/d_j, and it is feasible iff inequality (40) holds otherwise. Thus, similar to the stationary case, the time-dependent one-dimensional sets at a point in time t* can be computed by intersecting the regular range with inequality (39) resp. (40), which yields the corresponding interval bounds. Thus, for every point in time t* ∈ [0, T], the set M_v(t*) can be represented as a union of disjoint intervals, which we can use to compute the probability for the random boundary function b to be feasible, as in (9).
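Once M_v(t*) is available as a union of radial intervals, the SRD probability can be approximated by sampling directions on the unit sphere and evaluating the χ-distribution on these intervals. A minimal sketch follows; the interval computation itself is problem specific, so it is passed in as a hypothetical callback `radial_intervals`:

```python
import numpy as np
from scipy.stats import chi

def srd_probability(n, radial_intervals, n_dirs=10_000, rng=None):
    """SRD sketch: P(b in M) for Gaussian b in dimension n, where
    radial_intervals(v) returns the feasible radius intervals
    [(r_lo, r_hi), ...] along the ray r -> r*L_b(t*)v + mu_b(t*)."""
    rng = np.random.default_rng(rng)
    chi_n = chi(df=n)                       # chi distribution with n degrees of freedom
    total = 0.0
    for _ in range(n_dirs):
        v = rng.standard_normal(n)
        v /= np.linalg.norm(v)              # uniform direction on S^{n-1}
        for r_lo, r_hi in radial_intervals(v):
            total += chi_n.cdf(r_hi) - chi_n.cdf(max(r_lo, 0.0))
    return total / n_dirs

# sanity check: if every ray is fully feasible, the probability is 1
p_all = srd_probability(2, lambda v: [(0.0, np.inf)], n_dirs=100)
```

Averaging the χ-measure of the feasible radii over sampled directions approximates the sphere integral in the SRD formula.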
Next, we approximate the probability P(b ∈ M(t*)) using the KDE approach introduced in Subsection 2.1. We consider the stochastic equation corresponding to (36) with random boundary data; note that this equation also has a solution P-almost surely. We assume that the distribution of the random variable r_1(t*, 0) is absolutely continuous with probability density function ρ_{r,t*} for t* ∈ [0, T]. Thus the point in time t* has to be large enough that r_1(t*, 0) depends on the random boundary data. Similar to Section 2.1, the KDE identity holds. In order to approximate the unknown probability density function by the KDE, we need a sampling set for the random variable r_1(t*, 0). Therefore, for m = 1, …, N_F, let independent and identically distributed samples of a_m ∼ N(µ, Σ) be given (with mean value µ ∈ R^n_+ and Σ ∈ R^{n×n} positive definite). With this and Theorem 11 (or an appropriate numerical method for solving the system (36)), we define the sampling set. We choose the bandwidth according to (12); therefore, we get the same convergence results for the KDE resp. the approximated probability as in (13) resp. (14). So we can approximate the probability for a random boundary function b to be feasible at time t* ∈ [0, T]. So far, the computation of the desired probability in this subsection was only for box constraints at a certain point in time t* ∈ [0, T], e.g. the end time t* = T. As mentioned before, we are interested in box constraints for the full time period, which leads to a probabilistic constraint as given in (30). The idea is that r_1(t, 0) satisfies the box constraints for all t ∈ [0, T] iff the maximum and the minimum value of r_1(t, 0) on [0, T] satisfy the box constraints. We define t̲ := argmin_{t∈[0,T]} r_1(t, 0) and t̄ := argmax_{t∈[0,T]} r_1(t, 0).
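The KDE step for a fixed time t* can be sketched as follows; the Silverman rule of thumb stands in for the paper's bandwidth rule (12), and the Gaussian kernel allows the box probability to be evaluated in closed form:

```python
import numpy as np
from scipy.stats import norm

def kde_box_probability(samples, lo, hi):
    """Estimate P(lo <= X <= hi) from samples of X with a Gaussian KDE.
    Bandwidth: Silverman's rule of thumb (a stand-in for rule (12))."""
    x = np.asarray(samples, dtype=float)
    h = 1.06 * x.std(ddof=1) * x.size ** (-1 / 5)
    # integrating the KDE over [lo, hi] averages kernel CDF differences
    return float(np.mean(norm.cdf((hi - x) / h) - norm.cdf((lo - x) / h)))

# sanity check with standard normal samples and the box [-1.96, 1.96]
rng = np.random.default_rng(0)
p_box = kde_box_probability(rng.standard_normal(100_000), -1.96, 1.96)
```

In the paper's setting, `samples` would be the values r_1(t*, 0) computed from the sampled boundary functions via Theorem 11 or a numerical solver.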
For the SRD we use a similar procedure as above. That means we need to intersect the regular range with two inequalities of the form (41) resp. (42) to get the set M_v(t̲) ∩ M_v(t̄) and to compute the desired probability. This is only possible if one computes t̲ and t̄ for the deterministic boundary function b_D(t). Otherwise, due to the randomness of the boundary functions b(t), the argmin and the argmax can be shifted, so t̲ and t̄ depend on this uncertainty. Hence general values t̲ and t̄ do not exist, and it is not clear where to evaluate the mean and the variance in (41) resp. (42). But if we use the deterministic boundary function, t̲ and t̄ do not meet the minimal and maximal values of the random boundary functions, and thus the result may not be significant.
The KDE uses only the solution of the model for estimating an analytical probability density function, which we can use later for the optimization. To extend the KDE to probabilistic constraints like (30), we consider the minimal and maximal contamination concentration at node v_0 in the time period [0, T]. We assume that the distribution of the random vector R := (min_{t∈[0,T]} r_1(t, 0), max_{t∈[0,T]} r_1(t, 0)) is absolutely continuous (cf. (32)). The initial function is chosen s.t. the solution is constant at x = 0 for small times, so we guarantee that all minimal values are below this constant value and all maximal values are above this constant value (see Remark 15). The other values are given in Table 5. For the MATLAB® implementation, we use a sampling of 1 · 10^5 boundary functions in terms of Fourier series. We compare the probabilities of the KDE again with a classical Monte Carlo method (MC). The MC method checks for a random boundary function whether the bounds are satisfied at every point of the time discretization; if this is not the case, this boundary function is not feasible. The results of the tests are shown in Table 6. One can see that the results of MC and the KDE are almost equal: the mean probability in MC resp. KDE is 74.38% resp. 74.37%, and the variance is 0.0141 resp. 0.0142. Both methods are still quite fast; MATLAB® needs much more time (∼1 minute) for sampling and computing the random boundary data than for computing the probabilities, which takes less than one second.
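The Monte Carlo comparison described above can be sketched as follows; `sample_solution` is a hypothetical helper that returns, for one sampled boundary function, the trajectory r_1(t, 0) on the time grid:

```python
import numpy as np

def mc_time_feasibility(sample_solution, r_min, r_max, n_samples=1000, rng=None):
    """Monte Carlo estimate of P(r_min <= r_1(t,0) <= r_max for all grid times).
    sample_solution(rng) is a hypothetical helper returning the trajectory
    t -> r_1(t, 0) on the time grid for one sampled boundary function."""
    rng = np.random.default_rng(rng)
    feasible = 0
    for _ in range(n_samples):
        traj = sample_solution(rng)
        if traj.min() >= r_min and traj.max() <= r_max:  # bounds at every grid point
            feasible += 1
    return feasible / n_samples

# toy check: constant trajectories at a uniform random level in [0, 10]
p_mc = mc_time_feasibility(lambda rng: np.full(101, rng.uniform(0.0, 10.0)),
                           r_min=2.0, r_max=8.0, n_samples=5000, rng=1)
```

Checking the bounds at every grid point is exactly the feasibility test the MC comparison uses; the returned fraction estimates the probability in (30).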
In this subsection, we have considered the SRD and the KDE in a dynamic setting based on the results from Section 2: first with box constraints holding only at a certain point in time t* ∈ [0, T], and then with box constraints holding on the full time period [0, T].

Stochastic Optimization on Dynamic Flow Networks
In this subsection, we formulate necessary optimality conditions for the dynamic hyperbolic system introduced before. In the previous subsection, we introduced different ways to compute the probability for a random boundary function to be feasible; using the KDE gives a good approximation of this probability. We define the corresponding feasible set.
For a probability level α ∈ (0, 1), consider the optimization problem with the approximated probabilistic constraints
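Numerically, such a problem can be handed to a standard NLP solver once the KDE-based constraint is available as a smooth function of the control. The following toy sketch (hypothetical cost and output map, not the paper's model) keeps the noise sample fixed so the approximated constraint is deterministic and smooth:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)                      # fixed sample of the uncertainty
h = 1.06 * noise.std(ddof=1) * noise.size ** (-1 / 5)  # Silverman bandwidth

def kde_prob(u, lo=2.0, hi=8.0):
    """KDE probability that the (toy) output u + noise lies in [lo, hi]."""
    x = u + noise
    return float(np.mean(norm.cdf((hi - x) / h) - norm.cdf((lo - x) / h)))

alpha = 0.9
res = minimize(lambda u: (u[0] - 10.0) ** 2,           # cost pulls the control to 10
               x0=[5.0],
               constraints=[{"type": "ineq",
                             "fun": lambda u: kde_prob(u[0]) - alpha}],
               method="SLSQP")
# the probabilistic constraint becomes active and the optimizer stops short of u = 10
```

Because the KDE constraint is smooth in the control, gradient-based solvers such as SLSQP apply directly, which is the practical payoff of the KDE over a raw Monte Carlo indicator.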

Conclusion
In this paper, we have shown two different ways to evaluate probabilistic constraints in the context of hyperbolic balance laws on graphs, for both a stationary and a dynamic setting with box constraints on the solution.
The spheric radial decomposition is a good method for computing the probability for random boundary data to be feasible in the stationary case, and in the dynamic case for box constraints at a certain point in time. Because the spheric radial decomposition is based explicitly on the analytical solution, it leads to good results. But as soon as the analytical solution is not available, the inequalities cannot be derived, and the spheric radial decomposition becomes an almost purely numerical method.
A kernel density estimator does not explicitly need the analytical solution and provides an estimated, but analytical, probability density function, which can be used for computing the probabilistic constraint and for deriving necessary optimality conditions for probabilistically constrained optimization problems. In addition, the kernel density estimator is smooth even if the exact probability density function is nonsmooth. The examples showed that the kernel density estimator provides results almost as good as those of the spheric radial decomposition and of a classical Monte Carlo approach. Of course, the Monte Carlo approach is faster and easier to use, but it cannot be used to obtain analytical results such as the necessary optimality conditions. Another advantage of both methods is that they do not depend on the topology of the graph: the topology might influence the analytical solution, but the methods themselves still work. Further, the idea of the kernel density estimator can easily be applied to any kind of partial differential equation with random boundary data, because it only requires a sufficiently good solution of the PDE, e.g. obtained by numerical methods, which has been the subject of many papers in the last decades.
Figure 1: Computing the probability for a random vector to be in a certain set: direct approach vs. SRD. (a) Direct approach: integrate the density function over a certain set. (b) SRD: integrate rays evaluated at the χ-distribution over the unit sphere.

Table 3: Values for the example with two edges.
This representation of a random boundary function as a Fourier series requires b_D(0) = 0. If this is not satisfied, i.e. if b_D(0) ≠ 0, one can shift b_D by b_D(0), compute the Fourier series representation of the shifted function, and shift this representation back by b_D(0), as we do later in Example 3.
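The shift trick can be sketched numerically. Here ψ_m(t) = sin(π(m+1)t/T) is only a hypothetical stand-in for the paper's basis (31); what matters is that every truncated series vanishes at t = 0, so a function with b_D(0) ≠ 0 must be shifted first:

```python
import numpy as np

def shifted_fourier_fit(bD, T, NF, n_grid=2001):
    """Fit a truncated sine series to bD - bD(0) and shift it back,
    returning the grid and the approximation of bD itself."""
    t = np.linspace(0.0, T, n_grid)
    dt = t[1] - t[0]
    psi = np.array([np.sin(np.pi * (m + 1) * t / T) for m in range(NF)])
    shifted = bD(t) - bD(0.0)                  # vanishes at t = 0 by construction
    coeffs = 2.0 / T * (psi @ shifted) * dt    # L2 projection onto each psi_m
    return t, bD(0.0) + coeffs @ psi           # shift the representation back

# b_D(0) = 5 != 0, so the shift is needed; -2 sin(2t) is itself a basis
# function on [0, pi], so the truncated fit recovers b_D almost exactly
t, approx = shifted_fourier_fit(lambda t: -2.0 * np.sin(2.0 * t) + 5.0,
                                T=np.pi, NF=30)
```

The same pattern appears later in the dynamic example, where b_D(t) = −2 sin(2t) + 5 is shifted by 5 before computing its coefficients.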

Remark 12. We set t* := Σ_{j=1}^{n} L_j/|d_j|. For points in time t ≤ t*, the solution can depend explicitly on the initial condition. Since the |d_i| are absolute velocities and the L_i are lengths, the information from the right boundary needs L_n/|d_n| seconds to travel along the n-th edge. After L_n/|d_n| seconds, the solution on the n-th edge only depends on the boundary data, but the solution on edge n − 1 can still depend on the initial condition of the n-th edge. A scheme of characteristics for a graph with 4 edges is shown in Figure 4.

Figure 4: Characteristics of (36) on a graph with 4 edges

Figure 5(b). Next we show that the initial conditions hold. For x < d_k t + d_k Σ_{j=k}^{n} L_j/d_j and = k, we have

Figure 6: Areas in which the initial and the boundary conditions of (36) hold
which is a good approximation of b for N_F large enough (see Remark 10), and, as mentioned before, if (b_D)_k ∈ L^2([0, T]; R), then b_k ∈ L^2([0, T]; R) P-almost surely. Because water cannot get cleaned at the nodes, we are only interested in positive boundary values (cf. Section 2). Therefore, we assume that b_D ∈ L^2([0, T]; R^n_{≥0}) and that the parameters µ, Σ of the distribution of a_m are chosen s.t. the probability that b ∈ L^2([0, T]; R^n_{≥0}) is almost 1. As mentioned before, we want the solution at a time t* ∈ [0, T] at the node v_0 to satisfy box constraints, s.t.
−Σ_{j=ℓ}^{n} L_j/d_j ≤ t* < −Σ_{j=1}^{n} L_j/d_j (ℓ ∈ {1, …, n}). Due to the distribution of the random values a_m (m = 0, 1, …), it holds that b ∼ N(µ_b(t), Σ_b(t)), with µ_b(·) ∈ R^n_+ and Σ_b(·) ∈ R^{n×n} positive definite. To compute the desired probability for a point in time t* ∈ [0, T], we use the idea of the SRD. For a point v ∈ S^{n−1} on the unit sphere, we set b_v(r, t) = r L_b(t) v + µ_b(t) = r π_b(t) + µ_b(t), with π_b(t) = L_b(t) v and L_b(t) s.t. L_b(t) L_b(t)^⊤ = Σ_b(t). Because we are only interested in positive boundary values, we define the regular range as

is absolutely continuous. For the computation, we use r_0(x) = 5 exp((m/d)(x − L)) as initial function, b_D(t) = −2 sin(2t) + 5, and b(t, ω) = Σ_{m=0}^{∞} a_m(ω) a^0_m ψ_m(t) + 5 as random boundary function. The coefficients a^0_m for the shifted function b_D(t) − 5 are given by

Table 5: Values for the dynamic example
Further, we use 101 points for the time discretization and we truncate the Fourier series of the random boundary data after 30 terms. A sampling of 10 random boundary functions and the corresponding solutions is shown in Figure 9.

Figure 9: Sampling of 10 random boundary functions and the corresponding solutions at node v_0. (a) 10 random boundary functions. (b) r_1(t, 0) for the random boundary data.

Example 1: We give an example to illustrate that the results of the KDE and the SRD are similar. We use the values (without units) from Table 1. With the inequalities

Table 1: Values for the example with one edge.
With (8) and the values given in Table 1, we can see that b is feasible iff b ∈ [0, √20]. For comparing the probability, we use a classical Monte Carlo (MC) method, in which we check the percentage of points inside [0, √20].

Table 2: Results for the example with one edge.

· 10^5 points. The results for 8 tests are shown in Table 4.
Table 4: Results for the example with two edges.
the solution of (36) at node v_0 at time t* ∈ [0, T] with boundary function b_{S,i} ∈ B_A. The samplings B_A and R* are also independent and identically distributed. Then, for a bandwidth h ∈ R_+, the probability density function ρ_{r,t*} is approximately given by ρ_{r,t*,N}

Table 6: Results for the dynamic example with one edge