Joint Model of Probabilistic-Robust (Probust) Constraints Applied to Gas Network Optimization

Optimization problems under uncertain conditions abound in many real-life applications. While solution approaches for probabilistic constraints are often developed in case the uncertainties can be assumed to follow a certain probability distribution, robust approaches are usually applied in case solutions are sought that are feasible for all realizations of uncertainties within some predefined uncertainty set. As many applications contain different types of uncertainties that require robust as well as probabilistic treatments, we deal with a class of joint probabilistic/robust constraints. Focusing on complex uncertain gas network optimization problems, we show the relevance of this class of problems for the task of maximizing free booked capacities in an algebraic model for a stationary gas network. We furthermore present approaches for finding their solution. Finally, we study the problem of controlling a transient system that is governed by the wave equation. The task consists in determining controls such that a certain robustness measure remains below some given upper bound with high probability.


Joint Probabilistic/Robust Constraints
Decision making problems are more than often affected by parameter uncertainty. An optimization problem dealing with uncertain variables has the typical form min x g 0 (x) subject to g i (x, z) ≥ 0 (i = 1, . . . , k). (1) Here x ∈ R n is a decision vector, z ∈ R m is an uncertain parameter, g 0 : R n → R is the objective function and g : R n × R m → R k is the constraint mapping. The decision support schemes with non-deterministic parameters have to take into account the nature and source of uncertainty while balancing the objective and the constraints of the problem. There are two main approaches for dealing with uncertainty in the constraints of an optimization problem: the first one is the use of probabilistic constraints. This approach is based on the assumption that historical data is available to estimate the probability distribution of the uncertain parameter and thus considering it as a random vector ξ taking values in R m . Then (1) may be given the form (note that the first '≥' sign is to be understood component-wise). Here, a decision x is declared to be feasible if and only if the original random inequality system (1) is satisfied with a prescribed probability p, a level usually chosen close to, but not identical to one. Of course, higher values of p lead to a smaller set of feasible decisions x in (2), hence to optimal solutions at higher costs. Concerning the random variable ξ we essentially focus on continuous ones. For a standard reference on optimization problems with probabilistic constraints we refer to the monograph [22].
An alternative approach is given by robust optimization. It applies when the uncertain parameter u is completely unknown or non-stochastic due to a lack of available data. Then, satisfaction of the uncertain inequality system (1) is required for every realization of the uncertainty parameter within some uncertainty set U ⊆ R m containing infinitely many elements in general, so that one arrives at the following optimization problem: For a basic monograph on robust optimization, we refer to [3]. We notice that both optimization models with probabilistic and robust constraints are deterministic reformulations of (1), since they depend only on the decision vector x but no longer on the outcome of the uncertain parameter z.
A central issue in robust optimization is the appropriate choice of the uncertainty set U . Simple-shaped sets like polyhedra or ellipsoids induce computationally tractability [2] and allow one to deal with much larger dimensions than in the case of probabilistic constraints. Moreover, when choosing U such that P(ξ ∈ U ) = p, then the feasible set of decision variables x of (3) is contained in that of probabilistic programming (2), so that an optimal solution to (3) is a feasible solution to (2). For these two reasons, robust optimization is not only preferred in the absence of statistical data, but also as a conservative approximation of the probabilistic programming setting. This conservatism, however, may be significant up to the point of ending up at very small or even empty feasible sets possibly coming at much higher costs than under a probabilistic constraints. This trade-off propels the use of probabilistic constraints in the presence of statistical information at least in moderate dimension.
Although these approaches, probabilistic programming and robust optimization are often dealt with separately, in many applications, one is faced with uncertain variables of both mentioned types. This leads us naturally to the consideration of uncertain inequalities (2) in which the uncertain variable has a stochastic and a non-stochastic part, i.e., z = (ξ, u). A typical example is optimization of gas transport in the presence of uncertain loads. Historical data are available (stochastic uncertainty) for the exit loads in many situations. On the other hand, observations can hardly be accessed (non-stochastic uncertainty) for entry loads, or for example, uncertain roughness coefficients in pipes. Therefore, it seems natural, to combine the originally separate models (2) and (3). An appropriate way to do so is to formulate a probabilistic constraint (w.r.t. ξ ) involving a robustified (w.r.t. u) infinite inequality system: This class of constraints has been recently investigated in [23] under the name of hybrid robust/chance-constraints and in [10] under the name of probabilistic/robust constraints. For easier reference, we shall be using in the following the natural acronym of probust constraints. We note that even the more complex situation of the uncertainty set depending on the decision and random variable plays an increasing role in applications. Here, the constraint becomes P (g(x, ξ, u) ≥ 0 ∀u ∈ U (x, ξ )) ≥ p, (5) where the inner part resembles constraint sets arising in generalized semi-infinite optimization [25]. We note that yet another form of combining the probabilistic and robust parts of the inequality system would result from interchanging their arrangements in (4): In this way one does not arrive at a probabilistic constraint involving infinitely many random inequalities as in (4) but rather at an infinite system of probabilistic constraints. This setting is related to (robust) first-order stochastic dominance constraints [6] and to distributionally robust probabilistic constraints [26], which currently receives increased attention. We won't deal with this model here but rather focus our considerations on (4) and (5) respectively, which impose stronger conditions in the sense of joint probabilistic constraints compared to individual ones.
The aim of this paper is to illustrate the application of this new class of probust constraints to optimization problems in gas transport under uncertainty in the exit and entry loads. Uncertainty in the roughness coefficients of the pipe is not a part of this paper and it has been analysed for example in [1]. For a recent monograph on gas transport optimization we refer to [18]. We will consider first the problem of maximizing free booked capacities in an algebraic model for a stationary gas network. The corresponding model is presented in Section 2. This overall problem is too complex, however, to be dealt with in this paper. Therefore we will split it into two subproblems, namely capacity maximization at exits (consumer side) which is discussed in Section 3 and capacity maximization at entries (provider side) which is analyzed in Section 4. Without loss of generality, we follow the convention of entry loads being non positive and exit loads being non negative.
While often the research in optimization in finite-dimensional spaces (including mixedinteger nonlinear optimization) and PDE-constrained optimization takes place within disjoint communities, we think that it is important to identify common problem structures that occur in both classes of problems, in particular in the development of methods that allow to take into account uncertainties. Therefore in Section 5 we discuss an optimization problem with a probust constraint for a system that is governed by a PDE that is related to the application in gas transport. The PDE model allows us to consider a transient System in Section 5, whereas in the first two parts of the paper, stationary problems are studied. The transient system is governed by the wave equation under probabilistic initial and Dirichlet boundary data at one end of the space interval. At the other end of the space interval, Neumann velocity feedback is active. For the system a desired stationary state is given. The robustness of the system is measured by the L ∞ -norm of the difference between the actual stateṽ and the desired statev. Due to the definition of the L ∞ -norm (as the essential supremum of the absolute value), this approach gives information about the maximal pointwise distance in space and time. Since our solutions are in fact continuous for appropriate initial and boundary data, the L ∞ -norm is equal to the maximum norm. The robustness requirement is that this pointwise distance remains below a given upper bound v max . In our system, the state depends on uncertain initial and boundary data with a given probability distribution. The meaning of the probabilistic constraint is the following: The probability that the robustness requirement is satisfied is sufficiently large, i.e., greater than or equal to a given value p ∈ (0, 1]. This probabilistic constraint can be written in the form of (4); for details see Section 5, (36). As a numerical example, we consider the optimization with respect to the feedback parameter in Subsection 5.3.

Maximization of Free Capacities in a Stationary Network
We consider a passive stationary gas network given by a directed graph G = (V , E) with a set V of vertices and a set E of edges. We shall assume that the set of nodes decomposes into a set V + of entries, where gas is injected and a set V − of exits, where gas is withdrawn.
Without loss of generality we label the nodes in a way that entries come first and exits last. The gas transport along the network is triggered by bilateral delivery contracts between traders who inject gas at entries and traders covering customer demands by withdrawing gas at exits. An amount of gas injected into or withdrawn from the network at certain nodes is called a nomination. We shall refer to b ≤ 0 and ξ ≥ 0 as to the vectors of nominations at entries and exits, respectively.
Nominations have to satisfy three conditions: 1. At each node (entry or exit) of the network, nominations must not exceed the capacity booked for that node by the respective trader. 2. Nominations must be balanced over the whole network, i.e., the sum of nominations at entries equals the sum of nominations at exits. 3. Nominations must be technically feasible in the sense that there exist pressures within given bounds at the nodes and a flow through the network such that the nominations at the exits can be served by the nominations at the entries.
The first condition has to be satisfied by the traders. Referring to C − , C + ≥ 0 as to the vectors of booked capacities at entries and exits, respectively, it can be written as where the intervals are to be understood in a multidimensional sense. The second condition is an automatic consequence of the collection of bilateral delivery contracts between entries and exits and can be written as where 1 + and 1 − are vectors filled with entries 1 of the respective dimension of entries and exits.
The third condition of technical feasibility of some joint nomination vector (b, ξ ) can be characterized by the existence of vectors q of flows along the edges of the network and π of pressure squares at the nodes satisfying the conditions Here, A is the incidence matrix of the network graph, := diag(( e )) e∈E is a diagonal matrix of roughness coefficients and the modulus sign for a vector has to be understood componentwise. The first two equations in (8) correspond to the first two Kirchhoff's Laws (mass conservation and pressure drop), whereas the interval condition imposes bounds on the pressure. It is actually these bounds that constrain the feasibility of nominations b, ξ , see, e.g., [18]. It is the network owners' responsibility to make sure-without knowledge of concrete bilateral delivery contracts between entries and exits-that condition 3. is satisfied for all nominations fulfilling conditions 1. and 2. This requirement clearly imposes a constraint on the booked capacities C + , C − via (6) saying for all (b, ξ ) satisfying (6), (7) there exists (q, π ) satisfying (8). It can be formally written as: Satisfying (9) in a rigorous way would yield (too) small booking capacities at the nodes of the network. Here, the network owner can benefit from the fact that nominations at the exits (gas withdrawn for consumption) are endowed with a typically large historical data base so that they can be modeled as random vectors obeying some appropriate multivariate distribution. This offers the possibility to relax the 'for all' condition on ξ in a probabilistic sense as to satisfy (9) with sufficiently high probability p. In this way, by choosing p close to one, it is possible to keep a robust satisfaction of technical feasibility while allowing for considerably larger booked capacities. A similar probabilistic modeling of entry nominations would not be justified (although historical data might be available here too) because their outcome is market driven and thus not a genuine random vector.
This motivates the network owner to relax the worst case condition in a probabilistic sense on the side of exits but keeping it on the side of entries. He then arrives at the following probabilistic formulation of feasible booked capacities C + , C − : Here, P refers to a probability measure related with the random vector ξ and p ∈ (0, 1] is a desired probability level chosen by the network owner. The expression on the lefthand side of this inequality provides the probability that a random exit nomination (within booked capacity) combined with an arbitrary entry nomination (within booked capacity and in balance with the exit nomination) is technically feasible. Now, for a given capacity vector (C + , C − ) it may turn out that the associated probability on the left-hand side of (10) is larger than the desired minimum probability p, e.g., 0.96 vs. 0.9. This would give the network owner the opportunity to offer larger capacities while still keeping the desired probability p. Therefore, he might be led to determine the largest possible additional capacities (x + , x − ) he could offer for booking by new clients. This would lead to the following optimization problem: Here, the weight vector w in the objective reflects some preferences the network owner could have in order to offer new booking capacities at different nodes. In the absence of preferences, he could simply choose w := 1. Note, that the nomination vector at exits has been split into ξ and y, where ξ refers to the nominations of already existing clients (thus endowed with historical data and amenable to stochastic modeling) while y refers to nominations of potentially new clients without nomination history. As these can therefore not be treated stochastically, they are considered with a 'for all' requirement similar to entry nominations. No such splitting is necessary on the side of entries because nominations have to be considered there with a 'for all' requirement anyway as they cannot be modeled stochastically in a straightforward manner. In the following section, we shall address in detail the capacity maximization problem for exits only, a restriction which allows us to solve numerically the arising entire optimization problem subject to probust constraints. In contrast, Section 4 will focus on entries only and discuss essential issues related with the solution of this alternatively restricted optimization problem.

Maximization of Booked Capacities for Exits in a Stationary Gas Network
As mentioned in the introduction, the overall problem of capacity maximization (11) is too complex to be dealt with here. Therefore, we shall focus in a first step on maximizing capacities at exits.

The Capacity Maximization Problem for Several Exits and One Entry
In the following we will make the assumption that the network is a tree and that there exists only one entry point serving m exits. The presence of cycles in the network would significantly complicate the numerical solution of the problem. Nonetheless, in Section 3.4, we sketch a possible methodology in the presence of cycles. The restriction to a single entry is made here, in order not to deal with the robust uncertainty related with the splitting of nominations within several entry nodes (see '∀b ∈...' condition in (11)) which will be considered later in Section 4 separately. Without loss of generality, we define the entry to be the root of the network labeled by index '0'. For simplicity, we assume moreover that, the booked capacity C + of the entry is large enough to meet the maximum possible load by all exits as well as possible extensions thereof after adding additional capacity at the exits as a result of an optimization problem. As a consequence of this constellation our general capacity maximization problem (11) reduces to an exit capacity maximization problem of the form Here, the remaining decision variables are just the extensions of exit capacities. Since no capacity extension for the single entry is intended and since its existing capacity is not constrained by our assumption, the corresponding constraint disappears as well as the balance equation which is just substituted in the description of technical feasibility. The resulting optimization problem does no longer contain entry nominations at all but only random exit nominations ξ and deterministic exit nominations y of new clients along with the additionally allocated booking capacities x − . Clearly, the probabilistic constraint in (12) is not yet in the explicit form of the probust constraint (4). This can be achieved in our case, thanks to the network being a tree having the single entry as its root. Note that by this special structure the direction of the gas flow is completely determined. Moreover, by directing all edges in E away from the root, according to [8], a vector z of exit loads in this configuration is technically feasible, if and only if in the notation introduced above, the inequality system is satisfied, where In order to explain the definition of the functions h k above, we denote k l for k, l ∈ V if the unique directed path from the root to k, denoted (k), passes through l. The term H (e) refers to the head of the (directed) arc e ∈ E.
With these specifications, which are fully explicit in terms of the initial data of the problem, we reformulate problem (12) with the aid of inequalities (13) as In [16] it is shown, that the infinite random inequality system (15) can be reduced-using (13) and (14)-to the following finite one For the random vector ξ of stochastic exit nominations we will suppose that it follows a truncated multivariate Gaussian distribution: More precisely, the distribution of ξ is obtained by truncating an m-dimensional Gaussian distribution with mean μ and covariance matrix to an m-dimensional rectangle [0, C − ] representing the (historical) booked capacity at all exit nodes. By definition of truncation, this means that there exists a Gaussian random vectorξ ∼ N (μ, ) such that holds true for all Borel measurable subsets A ⊆ R m . Hence, in order to determine probabilities under a truncated Gaussian distribution, it is sufficient to be able to determine probabilities under a Gaussian distribution itself. Applying this observation to the probabilistic constraint (15) and combining it with (16) yields the equivalent description This is now, in contrast to (15) a conventional probabilistic constraint over a finite inequality system. We are aiming to apply an efficient gradient based solution algorithm in the framework of gradient based optimization methods. To this end, in order to deal algorithmically with the probabilistic constraint (17), one has evidently to be able to calculate for each fixed decision vector x − the probabilities occurring there, as well as their derivatives with respect to x − . In the following section we briefly sketch the methodology used here.

Spheric-Radial Decomposition of Gaussian Random Vectors
From the well-known spheric-radial decomposition (see, e.g., [7]) of a Gaussian random vectorξ ∼ N (μ, ) it follows that the probability of an arbitrary Borel measurable subset M of R m may be represented as the following integral over the unit sphere S m−1 : Here, μ χ refers to the one-dimensional Chi-distribution with m degrees of freedom, μ η is the uniform distribution on S m−1 and where P is a factor from a decomposition = P P T of the covariance matrix . Following these remarks, the probability on the left-hand side of (17) (depending also on the decision variable x − ) can be represented as where and, with P t denoting row number t of P , for k, l = 0, . . . , m: In order to evaluate the integrand in (18), one has to be able to characterize (for each given v ∈ S m−1 and x − ∈ R m ) the set E(v, x − ) and to determine its Chi probability. The latter task is easily accomplished, whenever the set E(v, x − ) can be represented as a finite union of intervals because there exist numerically highly precise approximations of the one dimensional Chi distribution function.
Hence, we are left with the task of efficiently representing E(v, x − ) as a finite union of intervals. This is easily done for the first set in the intersection providing E(v, x − ) in (19) which can be shown either to be empty or an interval: As for the second part of the intersection in (19), we will provide for each k, l an explicit representation of the set E k,l (v, x) either as a single interval or as the disjoint union of two intervals, such that the intersection of all these sets (and the first set determined above) is readily obtained in the form of a finite union of disjoint intervals. Indeed, upon developing the expressions in (20) in terms of r, one arrives at the representation where, for k, l = 0, . . . , m: This leads, by case distinction and elementary calculus, to the following explicit represen- where the case study is done according to Along with (21) we may use this explicit description in order to efficiently represent the set E(v, x − ) in (19) as the desired finite union of intervals by determining the finite intersection of sets which are intervals or disjoint unions of intervals.
It is important to note that, at the same time, the partial derivatives of the probability with respect to the decision variable x − can be calculated as a spherical integral of type (18) again, however with a different integrand which is easily obtained from the partial derivatives of the initial data [24]. In this gradient formula, the same disjoint union of intervals as in the computation of the probability itself is employed. The spherical integrals can be approximated by finite sums using Quasi-Monte Carlo sampling on the sphere (see, e.g., [4]). Then, for each sampled direction v on the sphere, one may update first the probability itself and then, simultaneously, the gradient of the probability with respect to x − by using the same disjoint union of intervals in both cases. This approach makes the gradient come almost for free as far as computation time is concerned. Having access to values and gradients of the probabilistic constraint (17), one may set up an appropriate nonlinear optimization solver for solving (15). For the subsequent numerical results, we employed a simple projected gradient method.

Numerical Results for an Example
As an illustrating example, similar to [16, Section 6], we considered a network as displayed in Fig. 1 with one entry (filled black circle) and 26 exits. The parameters (i.e., pressure bounds, roughness coefficients, truncated Gaussian distribution for the random nominations at exits) were chosen from modified quantities of real network.
The applied gradient method cannot guarantee to find a global optimal solution because the optimization problem is non-convex in x − . However, a stationary point can be computed satisfying the probust constraint with a high accuracy. We are able to control the quality of the accuracy via the Quasi-Monte Carlo sampling. In our computations a number of 10 000 samples turns out to allow for reasonably accurate results.
We did not assume any preferences in the allocation of new capacities, hence the weight vector in the objective of (15) was chosen as w − := 1 − . The colored rings around exit points refer to the optimal cumulative capacities (historical+new), i.e., C − + x − after maximization, upon choosing probability levels p = 0.95, 0.9, 0.85, 0.8. The radii of the rings are proportional to the available capacities. It can be clearly seen, how decreasing of the probability level allows for increasing the allocation of capacity in certain regions of the network. Figure 2 illustrates how the computed solution for a probability level p = 0.8 works for two random exit nomination scenarios ξ simulated a posteriori according to the chosen truncated multivariate Gaussian distribution. The first scenario is feasible because one could add a common capacity to every exit (green color) in order to satisfy this scenario. In contrast, the second scenario is infeasible because one would have to reduce the capacities by an amount corresponding to the dark red rings, in order to satisfy this scenario. When simulating a large set of such scenarios, say 1000, it would turn out that according to the probability level p = 0.8 approximately 800 are feasible, while 200 are infeasible.

Methodology in the Presence of Cycles
It is generally acknowledged that the presence of cycles in gas networks is both realistic for applications and demanding for formal analysis. In what follows we elucidate this at calculating the probability of feasible nominations in a gas network with cycles. Networks with a single or with multiple node-disjoint cycles are covered in [8] which essentially relies on explicit formulas for the roots of univariate real polynomials of degree less than 5.
If cycles in a gas network are sharing edges, then the approach via the mentioned formula is no longer valid. It also cannot be stretched to more general cases. A first alternative attempt in this respect has been undertaken recently in [9] for networks with up to three mutually interconnected cycles.
To display the state-of-the-art in calculating by spheric-radial decomposition probabilities of sets of feasible nominations in gas networks with cycles, consider again for every sample v 1 , . . . , v s on the sphere. Analogously to the case of trees, the set E i can be expressed as a finite union of disjoint intervals, E i = ∪ l j =1 [a j , b j ], for calculating its probability, it is sufficient to determine all points where the ray rP v i + μ enters or exits the set of feasible load vectors M.
With cycles, the matrix A decomposes into a basis part A B and a non-vacuous nonbasis part A N whose columns correspond to the fundamental cycles with respect to the tree behind A B . Accordingly q, are split into q B , q N and B , N . In [8] it is shown that a given load (−1 ξ, ξ ) is feasible iff there exists a q N such that with the function g : Having in mind the spheric-radial decomposition and the sets E i , we insert ξ(r) = rP v i +μ into the above characterization of feasible loads and reformulate the min, max expressions. Then E i consists of all r ∈ R ≥0 for which there is a q N such that To decide, for a given sample point v i , whether the ray rP v i + μ enters or exits the set of feasible load vectors M and, in the affirmative, compute an entry or exit point, the following basic procedure is possible: Augment one of the inequalities of the system (23-25) as an equation to (22) yielding a system of |N | + 1 degree-2 multivariate polynomial equations with |N | + 1 unknowns.
Notice that the above considerations hold for gas networks with arbitrary cardinality |N | of non-basis columns in A. Of course, the number of augmentations, and hence the number of passes through some polynomial-equation solver can become exorbitant.
A first attempt on solving systems with multivariate polynomials via computer algebra is reported in [9] for |N | ≤ 3. In the core of the method there are Gröbner bases inducing "triangular" representations of the polynomial equations, allowing for "reverse propagation" of solutions, in the spirit of Gaussian elimination, with multivariate quadratic polynomials instead of linear forms.
In contrast with gradient-type analytical solvers, algebraic solvers using symbolic computation usually detect infeasibility of the system under consideration much faster, which can be crucial as the decision of (in-)feasibility is one of the fundamental tasks in this context. These methods rely on iterating bases of ideals. Emptiness follows as soon as there arises a constant polynomial in the current ideal basis.
There are a number of possible improvements, some of which investigated in [9] that deserve further explorations: identifying and removing redundant inequalities in (23)(24)(25), studying the special structure of the system (22) and exploring the impact of "Comprehensive Gröbner Systems" that were developed for parametric polynomial equations, see [20].

Capacity Maximization Under Uncertain Loads and Uncertain Distribution of Entry Nominations
After discussing the methodology for the case of uncertain exit loads, we address the case of uncertain entry loads and fixed exit capacities, i.e., x − = 0, and we only take extensions x + of the entry capacities C + into account. Thus, we consider the following optimization problem: In other words, for a realization d of the random variable ξ and for an extension x + , we want every entry nomination of the uncertainty set to be feasible with probability p. The set of realizations of ξ for which every entry nomination is feasible for a given x + is henceforth denoted as D(x + ): Applying this notation, we can formulate the probust constraint of problem (26) more compactly: We note that the 'probust' nature of the constraint is 'hidden' in the probability constraint that is expressed in D(x + ).
Since ξ is based on historical data, the mean being feasible for a capacity extension x + is a natural assumption for practical applications at least if the upper bound y + is not too large. Furthermore, since there is, naturally, no infinite amount of capacity extension, such a bound y + does naturally exist.
In the following, we apply the spherical radial decomposition, see Section 3.2, to reformulate the probust constraint of problem (28) with an integral: As before, m is the number of exit nodes, S m−1 the unit sphere, μ χ refers to the one-dimensional χ -distribution with m degrees of freedom, μ η is the uniform distribution on S m−1 and where P is a factor from a decomposition = P P T of the covariance matrix .
We aim to solve problem (29) numerically and we approximate the integral. We briefly summarize the method of Section 3.2 for our purposes: We sample N vectors v 1 , . . . , v N of the unit sphere S m−1 and compute E(v i , x + ) for each sampled vector v i . Hence In [8], it is pointed out that E(v i , x + ) is a finite union of intervals: and that in this case, where F χ is the distribution function of the respective χ distribution. Now, for the numerical accessibility of (31) we make an additional assumption: (C2) For any x + ≥ 0, D(x + ) is star-shaped with respect to μ.

Using (C2), it is immediately seen that
Furthermore, due to condition (C1), we have 0 ∈ E(v i , x + ) which implies a 0 = 0. With this b 0 is the length of the interval E(v, x + ), i.e., b 0 = |E(v i , x + )|. As there exist highquality numerical approximations for F χ , the value of μ χ (E(v i , x + )) can be computed, if b 0 is numerically accessible.
In summary, under conditions (C1) and (C2), approximating the integral in (30) for a given capacity extension x + effectively reduces to sampling vectors v i on the unit sphere and determining |E(v i , x + )|.
Before we continue, we briefly discuss the role of (C2). First of all, it is important to state that without (C2) it is not clear, how (30) can be evaluated numerically. Second, to the best of our knowledge, there is no applicable criterion for testing (C2). That means, we need to assume that in practice (C2) holds. Fortunately, this is not as bad as it sounds at the first glance. The reason is that we are able to show that in the case where E(v i , x + ) consisted of more than one interval, our algorithm would always correctly approximate the length of the first interval, i.e., the interval with lower bound 0. Due to the simple estimate this would mean that, instead of approximating |E(v, x + )|, we would compute a valid lower bound. As a consequence, the computed result would be feasible for the optimization problem (29) and thus a (potentially conservative) valid underestimator of the true maximizer. During the remainder of this section, we will present and discuss an algorithm for approximating |E(v i , x + )| for any sampled v i . In particular, we will apply binary search: In every iteration, it will be decided whether a given r ≥ 0 is an element of E(v i , x + ). This decision is made by solving a nonlinear optimization problem (NLP) which, for all b ∈ U (μ + rP v, x + ), checks whether the henceforth called pressure flow solution (π, q) fulfills (8). We prove the correctness of this decision procedure under the following assumption: (C3) There is a node j ∈ V with fixed pressure, i.e., π j, * = π * j . Condition (C3) implies that there exists exactly one solution (π, q) which fulfills Aq = (b, ξ ) T and A T π = − |q|q, see for example Theorem 7.2. of [18]. As we will see later, the uniqueness of the pressure flow solution is crucial for the correctness of the presented algorithm.
In order to ensure that all potential violations of pressure bounds are detected, globally optimal solutions are required. In order to achieve this, we relax the NLP to a mixed-integer linear problem (MIP) by interpolating the nonlinearities with linear splines and modeling these splines through linear constraints and additional binary and continuous variables. The resulting MIP can be solved globally with off-the-shelf-software. The effects of the linearization are pointed out in the remainder of the section.
We would like to point out that, like condition (C1), condition (C3) is not a critical assumption in reality. Since gas is injected at some entry node, we can assume that the pressure at this node is known.
We conclude this section by the presentation of computational results that show the effectiveness of our method.

Methodology for General Stationary Networks
As discussed beforehand, approximating the integral under conditions (C1) and (C2) can be reduced to deciding whether r ∈ E(v, x + ) for a real number r ≥ 0 and a sampled vector v ∈ S m−1 . In other words, we need to check whether μ + rP v ∈ D(x + ), i.e., whether μ + rP v is robust feasible: Definition 1 Let d be a realization of the random variable ξ and let x + be an entry capacity extension. If d ∈ D(x + ), d is called robust feasible. The problem of deciding d ∈ D(x + ) is denoted as DecProb(d, x + ). Analogously, for a sampled vector v ∈ S m−1 , the real number r ≥ 0 is called robust feasible for v, if μ+rP v ∈ D(x + ), i.e., r ∈ E(v, x + ). The problem of deciding the robust feasibility of a number r for a vector v is denoted as DecProb(r, v, x + ).
In the special case of unbounded pressure at every node, DecProb(d, x + ) would be answered positively for every feasible realization d and every extension x + (see [18], Theorem 7.1). Although in real gas network operations and in our setting the pressures are bounded, we can use the following: We introduce penalty functions for every u ∈ V , in formulas, If f u (π u ) > 0 for a node u ∈ V , π u lies outside its bounds. Consequently, π ∈ [π * , π * ] if and only if u∈V f u (π u ) = 0.
Now consider a balanced nomination (b, d) T and the optimization problem max π,q u∈V f u (π u ) (33) Since the pressure is unbounded, there exists a pressure flow solution (π, q) and since condition (C3) holds, it is unique and the feasible set of problem (33) is atomic. Consequently,  (b, d) is a realizable nomination if the optimal value of problem (33) is 0, i.e., (32) holds.
However, for a given sampled vector v and a real number r, we want to check whether DecProb(r, v, x + ) is answered positively, i.e., if for any b which results in a balanced nomination (μ + rP v, b), there exists a pressure flow solution which satisfies (8). Therefore, we modify (33): The feasible set of Pen (r, v, x + ) consists of the vectors (b, π, q) for which (μ + rP v, b) is a balanced nomination and for which there exists a pressure flow solution which satisfies (8) but could violate the pressure bounds.
In particular, we will show in the following theorem that r is robust feasible for v if and only if the optimal value of problem Pen (r, v, x + ) is 0.

Theorem 1
Let v ∈ S m−1 , r ≥ 0 and let x + ≥ 0 be a capacity extension at the entries. Assume that condition (C3) holds. Then DecProb(r, v, x + ) is answered positively if and only if problem Pen (r, v, x + ) is solvable with optimal value 0.
Proof Since f u (π u ) is nonnegative for all nodes u ∈ V , the optimal value of Pen (r, v, x + ) is at least zero.
Assume on the one hand that DecProb(r, v, x + ) is answered positively, i.e., μ + rP v is robust feasible. Now consider a solution (b, π, q) which is feasible for problem Pen (r, v, x + ). If the objective value of (b, π, q) was strictly positive, there would exist a node j ∈ V with π j / ∈ [π * ,j , π * j ]. Since the pressure flow solution is unique (due to condition (C3)), this contradicts the robust feasibility of μ + rP v. Therefore, the objective value is 0. Since this applies for every feasible solution, the optimal value of Pen (r, v, x + ) is 0.
On the other hand, assume that the optimal value is 0. We maximize which implies that for all feasible solutions (b, π, q), π lies in its prescribed bounds. In other words, for every b ∈ U (μ + rP v, x + ), the unique pressure flow solution (π, q) satisfies (8). This implies the robust feasibility of (μ + rP v, x + ), i.e., DecProb(r, v, x + ) is answered positively.
We note that condition (C3) is crucial in the proof of Theorem 1. Without pressure bounds, the projection of the pressure flow solutions to the squared pressure component has, for a fixed π , the form [18]. Hence, unless one pressure is fixed, the pressure values are not necessarily unique and problem Pen (r, v, x + ) can be unbounded although r is robust feasible.
As already discussed, we need to determine the length of the interval E(v, x + ) and as a consequence of Theorem 1, problem Pen (r, v, x + ) can be used to determine the length. In particular, by applying a binary search, which solves the subproblem Pen (r, v, x + ) in every iteration, we can determine the length of E(v, x + ).
A binary search algorithm requires a lower and an upper bound. Since 0 ∈ E(v, x + ) ⊂ R ≥0 (condition (C1)), we choose 0 as a lower bound. A trivial upper bound for E(v, x + ) is given by the exit capacities However, we can even give a tighter bound. Due to (8), the pressure drop constraint holds for every arc (i, j ) ∈ E. Since the pressures are bounded and i,j > 0 for every arc (i, j ), we can derive flow bounds for every arc which do not depend on the actual nomination. In the following, these flow bounds, which are called implicit flow bounds and are denoted by q * and q * , are exploited to determine a tighter upper bound for E(v, x + ):

Lemma 1
Let v ∈ S m−1 , let q * and q * be the implicit flow bounds and x + ≥ 0 be a capacity extension at the entry nodes. For a node u ∈ V , let δ − (u) denote the set of incoming arcs and let δ + (u) denote the set of outgoing arcs. Then an upper bound for E(v, x + ) is given by the optimal value of the optimization problem Proof Since we are interested in an upper bound for E(v, x + ), 0 ≤ μ + rP v ≤ C − and r ≥ 0 are satisfied. Furthermore, Kirchoff's first law demands for a flow q. Substituting the flow variables by the implicit flow bounds q * and q * results in This concludes the proof.
With Lemma 1, the prerequisites for the binary search have been taken. However, problem Pen (r, v, x + ) is a non-convex non-linear optimization problem. Since we require global optima, we aim to linearize the non-linear constraints of problem Pen (r, v, x + ), i.e., the Weymouth equation j with linear splines s i,j (q i,j ). For a given linearization error > 0, the linear splines are constructed such that for every (i, j ) ∈ E. Therefore, in Pen (r, v, x + ), we relax (34) with The splines s i,j (q i,j ) are modeled with the incremental method, see [21], using an additional set of linear inequalities and equations and additional continuous and binary variables. Hence subproblem Pen (r, v, x + ) is relaxed to a MIP which can be solved to global optimality. The optimal value of the relaxation is an upper bound for the optimal value of problem Pen (r, v, x + ). Due to the objective being non-negative, if the optimal value of the linearized problem is zero, the optimal value of problem Pen (r, v, x + ) is zero as well. The linearized version of problem Pen (r, v, x + ) is henceforth denoted as Pen(r, v, x + , ).
Before we summarize our algorithm for finding a lower bound for the length of E(v, x + ), we note that the incremental method is applied for modeling linear splines which are defined on compact intervals. In our case, the spline variables are the flow variables which are, at least by definition, unbounded. In practice, one can for example apply the preprocessing developed in [1] for determining flow bounds. This method neglects the pressure bounds as well, which is the reason why we can apply it.
In the following, we summarize our procedure to find a lower bound for the length of E(v, x + ). The tolerance for the binary search is henceforth given by tol > 0: Algorithm 1 bounds |E(v, x + )| from below with an error of at most tol. Due to Theorem 1 and Lemma 1, Algorithm 1 terminates with a correct lower bound.
This concludes the presentation of our method to determine a lower bound for |E(v, x + )| under the conditions (C1), (C2) and (C3). We note that there are several sources of approximation errors which are caused by the binary search and the linearization of Pen (r, v, x + ). Yet, those can be limited by reducing tol and in Algorithm 1, respectively. However, one has to keep in mind that reducing tol results in more iterations and that reducing , i.e., a tighter linearization, results in more binary variables for Pen(r, v, x + , ). Both lead to an increase of the running time, see Section 4.2.
Before we discuss our numerical results, we would like to demonstrate, how our algorithm could be modified to produce a lower bound on E(v, x + ) in the case when condition (C2) fails to hold. As pointed out before, without condition (C2), the set E(v, x + ) is in general not an interval but a finite union of intervals. Due to condition (C1), one of those intervals, henceforth denoted as E 0 (v, x + ), has 0 as its lower bound. Obviously, , if all r ∈ [0, R] are robust feasible for v. Therefore, we can check whether R ∈ E 0 (v, x + ) by modifying problem Pen (r, v, x + ) and solving max r,b,π,q u∈V f u (π u ) The feasible set of this optimization problem consists of the vectors (r, b, π, q) for which (μ + rP v, b) is a balanced nomination (with r ≤ R) with pressure flow solution (π, q) which is unique due to condition (C3). Thus, the only difference in problem Pen (r, v, x + ) and the above optimization problem is r being a variable since we have to check whether μ + rP v is robust feasible for every r ∈ [0, R]. Consequently, using this modified problem in Algorithm 1 instead of problem Pen (r, v, x + ) yields a lower bound for E 0 (v, x + ) and thus a (potentially conservative) lower bound on E(v, x + ).
In the next subsection, we evaluate Algorithm 1 with respect to quality of the obtained solutions and running time.

Numerical Results
We modify the gas network instance of Section 3.3 by adding a second entry next to the first entry (the black filled node) which implies that x + ∈ R 2 . We note that the instance is still a tree and that the structure of the instance has not changed substantially which has been desired since we do not want to analyze an instance which is very different from the one of Section 3. However, if the instance had only one entry, the uncertainty set would have only one element which is not interesting in the context of this section. In addition, we fix the pressure at a leaf node. Beyond that, we provide sphere vectors v by sampling a collection of 10 000 elements on the unit sphere using a Quasi-Monte Carlo method. Our goal is to approximate the probability of robust feasibility for this network and uncertain entry loads by using a spheric radial decomposition and applying Algorithm 1. Since we can not verify condition (C2), we assume that D(x + ) is not star-shaped with respect to the mean.
The performance of the algorithm is investigated by testing the algorithm on the given instance under a variety of parameter combinations. All experiments were carried out using GUROBI 7.5 [15] with 4 threads running on machines with Xeon E3-1240 v5 CPUs (4 cores, 3.5 GHz each).
We apply Algorithm 1 to all 10 000 rays using all combinations of relaxation parameters ∈ {2 −6 , 2 −5 , . . . , 2 4 } and bisection termination tolerances tol ∈ {0.001, 0.01, 0.1}. Experiments for smaller tolerances down to tol = 10 −6 were carried out as well but are omitted here since they produced almost identical probabilities, when compared to tol = 10 −3 . The results of this study are displayed in Fig. 3. The approximated probabilities for robust feasibility of the exit nomination and the capacity extension of the instance are displayed in Fig. 3(a). We approximate the overall probability to be between 78 % and 78.5 %, depending on and tol. As expected, we obtain more conservative solutions for increasing approximation parameters . However, the influence of is much smaller than expected, even for large . In the same fashion, increasing the bisection termination tolerances tol leads to more conservative solutions. We note that for both parameters, a combination of = 1 2 and tol = 0.001 produces solutions that can be improved only very little (within the scope of this study) by decreasing both parameters further. The overall running times for all rays, i.e., the cumulated running time of Algorithm 1, applied for each ray, are plotted in Fig. 3(b). As expected, the running times increase for decreasing parameter , as the latter leads to more complex MIP models. A decrease of the tolerance tol leads to a larger number of iterations of the bisection algorithm and thus to longer running times as well. In the previous experiments, we focused on the influence of the relaxation parameter and the bisection precision tolerance tol on the algorithm's running time and on the reliability of the obtained probability. Another important impact on the overall running time is the number of rays that needs to be used. Fig. 4(a) shows the resulting probability when only the first k rays of the 10 000 given samples are used for the probability approximation. At a glance, we observe large fluctuations when using only up to about 2500 rays. A considerable decrease in the magnitude of the probability fluctuation can be seen for values of k ≥ 2500. We further strengthen this observation in Fig. 4(b) by comparing the first graph with a second graph that was obtained from 5000 other random sphere vectors in the same fashion. Since the second graph follows the same pattern, we conclude that for the instance considered here, the Resulting probability and total running time for 10 000 rays using different relaxation qualities and bisection termination tolerances tol number of rays should not be smaller than 2500 if the approximation of the probability has been supposed to be reliable. Assuming that the parameter selection k = 2500, = 1 2 , and tol = 0.001 is sufficient for a reliable probability calculation, the sum of all MIP running times is about 8 min.
As a final experiment, we demonstrate the practical applicability of our method by solving a simple optimization task where we assume that the number of sampled points k is large enough and that our approximation is good enough to check whether the probust constraint is satisfied for a given capacity extension.
The goal is to determine capacities for the two entry nodes such that the probability of robust feasibility is at least 75%. We use a linear cost function with equal costs for expansion at each node. In Section 3, the capacity problem with uncertain exit loads has been solved using (sub)gradient information in the sense of [24]. However, due to the different situation here caused by the MIP-approximations and the robust constraints, the derivation of suitable (sub)gradients needs further research that is beyond the scope of this article. Instead, we   Fig. 5 Contour plot of the probability for robust feasible entry capacities together with the trajectory of a gradient-free optimization method to determine a capacity with 75% feasibility decided to apply a gradient free pattern search algorithm available in MATLAB [19]. It is important to note that-due to the fact that the probust constraint is in general non-smoothno convergence of this algorithm to a stationary point can be guaranteed. Instead, one can expect convergence to a point in which directional derivatives are nonnegative for all directions in the positive basis used by the algorithm. We refer to [5] for a further discussion on this, as well as a general overview of derivative-free optimization. In every major iteration, a capacity extension x + has been proposed by the algorithm. For all sampled vectors v i , Algorithm 1 is applied to approximate |E 0 (v i , x + )|. Hence, the probability of robust feasibility is estimated from below by 1 2500 2500 i=1 F χ (|E 0 (v i , x + )|) and thus, the inequality 1 2500 In our experiment, we consider the entry capacities in the box [28000, 32000] × [4000, 8000] and start with C + = (−28000, −4000). In other words the extension x + is an element of the box [0, 4000] × [0, 4000] and our starting point is (0, 0).
Overall, the pattern search algorithm converged after 149 function evaluations. The red lines which connect the black, filled dots show the trajectory of the pattern search algorithm, with the optimum marked by a red cross. The extra function evaluations are represented by black circles. Obviously, the probability is not very sensitive with respect to capacity changes at entry 1 but clearly decreases, when the capacity is increased at entry 2. (Fig. 5) This concludes the discussion and presentation of the methodology for stationary gas networks. In the next section, transient systems controlled by the wave equation are discussed.

Stabilization with Probabilistic Constraints of a System Governed by the Wave Equation
Now, we consider a transient system that is governed by the wave equation. The wave equation is a linear model of the gas flow in gas pipelines for sufficiently small velocities. The state is determined by an initial boundary value problem with Dirichlet boundary data at one end and Neumann boundary feedback at the other end of the space interval. The initial data and the boundary data are given by a stochastic process. The aim is to maximize the probability to stay near a desired state everywhere in the time space domain.
Let a finite length L > 0 and a finite time T > 0 be given. In this section, let U = [0, T ] × [0, L]. Let c > 0 denote the sound speed in the gas. Let a stationary velocity field v be given, see [13]. Let v =ṽ −v denote the difference between the velocity and the stationary state. If the norm ofv is sufficiently small, the dynamics for v are governed by the wave equation v tt = c 2 v xx . Moreover the gas density ρ satisfies the wave equation ρ tt = c 2 ρ xx and the flow rate q of the gas satisfies the wave equation q tt = c 2 q xx ; see [14]. For given uncertain boundary data (that model the uncertain demand) ξ ∈ L ∞ (0, T ), an uncertain initial state (v 0 , v 1 ) ∈ L ∞ (0, L)×L 1 (0, L) and a feedback parameter η > 0, we consider the closed loop system that is governed by the initial boundary value problem An explicit representation of the generated state in terms of travelling waves (d'Alembert's solution) is given in [11,12]. This allows the computation of the system state v ∈ L ∞ (U ) without discretization errors. In the operation of pipeline networks, there is a constraint on the magnitude of the flow velocity in the pipe. Let v max > 0 be an upper bound for the velocity. We consider the probabilistic constraint for the probability where v solves (S) and · L ∞ denotes the norm on L ∞ (U ).
In order to write the probabilistic constraint similar to (4), we introduce the notation x) ∈ U , whereṽ solves the initial boundary value problem (S) with initial and boundary data that depend on the parameter (a, b) (see (KL-id) and (KL-bd) below).  Remark 1 If ω = 0 and T is sufficiently large, then v ∞ = |λ| holds.

Karhunen-Loève Approximation of a Wiener Process as Initial and Boundary Data
We consider the Karhunen-Loève representation (see [17]) of a Wiener process on [0, T ] with covariance function Cov(W t , W s ) = min(s, t) given by a k sin ω k π t T ω k π , ω k = k − 1 2 , with independently normally distributed random variables a k . It is reasonable to use a finite approximation of it as boundary data, i.e., with independently normally distributed random variables b k . We have the compatibility condition ξ(0) = v 0 (L) = 0. Furthermore, set v 1 = 0 (Fig. 7). Different realizations of the initial and boundary data can be seen in Figs. 8 and 9. The solution of the wave equation for different realizations of the initial and boundary data is depicted in Fig. 10.
The case is much more involved than that in Section 5.1, since the value of v L ∞ is not easily expressed as an analytic function of the random variables. This means a sampling scheme based on spheric radial decomposition can not be directly be applied. We use a quasi Monte Carlo method based on a Sobol sequence instead.
If one wants to approximate the L ∞ -norm of the velocity by pointwise evaluation on a grid, Lipschitz continuity of the velocity is required.
Theorem 4 (Lipschitz continuity of the solution) Assume the boundary data ξ ∈ C 0,1 (0, T ) and initial data v 0 ∈ C 0,1 (0, L) to be Lipschitz continuous and assume that Lipschitz compatibility over the edge holds, i.e., we have with a Lipschitz constant K > 0. Furthermore, let v 1 ∈ L ∞ (0, L). Then, under the assumptions of Theorem 2, the solution v of system (S) is Lipschitz continuous on U , i.e., v ∈ C 0,1 (U ).
Proof The sum of Lipschitz continuous functions is Lipschitz continuous. It is therefore sufficient to show the Lipschitz continuity of α and β defined as in Theorem 2. Without loss of generality-by going to the maximum of the occurring Lipschitz constants-we assume that they are all the same and denote each of them by K > 0. First, we show the Lipschitz continuity of V 1 . We have, for x, y ∈ [0, L] |V (x) − V (y)| =

Optimization of the Feedback Parameter
The feedback parameter η can be chosen such that the probability (35) as a function of η is maximized. We call this function G(η). We consider the probability to stay under the bound v max = 5 for different feedback parameters η > 0 on a grid with stepsize 0.05 between 1.5 and 4. The data for the example has been chosen as L = 2, T = 2, c = 0.5. For the approximation of the probability, 2000 samples were used for each value of η. The maximum of the probability is reached for completely absorbing feedback η = 1/c = 2. The peak in probability is very distinct. At the peak the probability function appears to be nonsmooth. Numerically, we find that the choice η = 1/c is optimal; see Fig. 11.

Conclusion
In this paper we dealt with a joint model of probabilistic and robust constraints, so-called probust constraints and illustrated their importance for gas transport under uncertainty. In particular, we addressed the problem of capacity maximization under uncertainty thereby distinguishing between the cases of uncertain exit and uncertain entry loads. Moreover, we considered a stabilization problem in a transient system governed by the wave equation and subject to probust constraints. By applying the spheric radial decomposition of Gaussian random vectors, we approximated the occurring probabilities and-where possible-their sensitivities with respect to the decision variables in order to numerically solve the resulting optimization problems. There are a lot of remaining challenges for future work, such as efficient incorporation of cycles or active elements in the network. Moreover, a full integration of the methodology outlined in Section 4 for the robust treatment of uncertain entries with the capacity maximization problem described in Section 3 ultimately would allow an application of the probust approach to arbitrary network topologies.