Controlling transient gas flow in real-world pipeline intersection areas

Compressor stations are the heart of every high-pressure gas transport network. Located at intersection areas of the network they are contained in huge complex plants, where they are in combination with valves and regulators responsible for routing and pushing the gas through the network. Due to their complexity and lack of data compressor stations are usually dealt with in the scientific literature in a highly simplified and idealized manner. As part of an ongoing project with one of Germany's largest Transmission System Operators to develop a decision support system for their dispatching center, we investigated how to automatize control of compressor stations. Each station has to be in a particular configuration, leading in combination with the other nearby elements to a discrete set of up to 2000 possible feasible operation modes in the intersection area. Since the desired performance of the station changes over time, the configuration of the station has to adapt. Our goal is to minimize the necessary changes in the overall operation modes and related elements over time, while fulfilling a preset performance envelope or demand scenario. This article describes the chosen model and the implemented mixed integer programming based algorithms to tackle this challenge. By presenting extensive computational results on real world data we demonstrate the performance of our approach.


Introduction
Throughout the past years, the mathematics of gas transport has been an intensively studied topic. While natural gas was, is, and will be one of the major energy sources in Germany, making the efficient and safe transport a field of high economical and political relevance [9], the task is also challenging from a mathematical point of view. One such challenge can be found in the compressors, which push the gas through the network by increasing its pressure. Compressors are typically set up as a compressor station, whereby multiple compressor units can be placed in specific configurations and dynamically adjusted to meet the current needs, allowing for different compression ratios and flow rates. At intersections of major transportation pipelines, arrangements of multiple compressor stations as well as other elements like valves or pressure regulators can be found. Such an arrangement makes it possible to choose different connections of the intersecting pipelines, and operate the system for various flow directions and pressure levels.
To optimize control of these areas taking all the technical restrictions into account is already combinatorially challenging. The complexity of the problem is further increased by the physics of gas flow. For pipes this physics is described by the Euler Equations [26], a set of nonlinear hyperbolic partial differential equations (PDEs), which even in simplified versions yield computationally challenging constraints.
Historically, research focused first on the simulation of gas flow, i.e., dealing with the partial differential equations given all the discrete decisions. This field has been studied for many decades already, see for example [3] and the references therein. Over recent years, the optimization of gas transport including also the combinatorial aspects have gained more and more attention. In [29] a general overview over optimization problems related to natural gas is given, which includes but is not restricted to the transport of gas. Most of the corresponding literature mentioned so far considers the stationary gas transport problem, which searches for one stable network state, making an algebraic description of the gas flow possible. An overview of state-of-the-art approaches for the stationary case can be found in [19] and [28], which consider large real-world instances and a huge amount of detail regarding the different network elements like compressors.
This article deals with the more challenging variant of the problem: The transient gas transportation problem. Here, the goal is to find a set of control decisions on the elements over a future time horizon. For this problem, research is still in early stages. One of the first publications on transient gas transport optimization was [23], who presented a mixed integer programming (MIP) model for the problem. In contrast to the structures described above, they only used a model consisting of single compressor units, whose compression capabilities are limited by a minimum and maximum power bound. This non-linear power bound as well as the non-linear pipe equations are approximated by piecewise linear functions. To solve the model for the objective of minimizing the compressor fuel costs, they used a special branching scheme for the piecewise linear functions as well as a simulated annealing heuristic described in [21]. A little later, [6] also presented a solution to the problem of minimizing the compressor fuel costs. They modeled the problem according to [23], using the same model for compressors and approximated the non-linear constraints by piecewise linear functions. However, they combined solving this MIP with solving a non-linear problem (NLP) formulation of the problem in an alternating way. From the solution of that non-linear problem they deduce a refinement of the piecewise linear approximations and repeat this procedure until finally arriving at a solution to the overall mixed-integer non-linear problem (MINLP) within a chosen approximation error. Other approaches tackle transient transport optimization problems, but neglect the discrete nature of some of the elements and therefore purely optimize over continuous variables, i.e., solve NLP problems. We mention as example the work of [33] and [22], who decide on the compression ratios of compressors, while again minimizing their fuel consumption. Very recently a few more studies on transient gas network optimization have been published. In [13] a specialized branching rule is used to solve a MINLP formulation of the problem with the objective to minimize fuel consumption. For the compressors they introduced the theoretical concept of different modes to switch between configurations, which each have separate feasible region. However, in the end they restrict to exactly one mode with nearly unrestricted compression capabilities for their experiments. Another approach combining different specialized solving techniques is presented in [11], where iteratively a MIP model and a NLP model are solved for each single time step. These two models arise from the use of a special discretization of the Euler equations. For compression, both models use single compressor units featuring a linear feasible region. In contrast to the other mentioned publications, the objective function in [11] was not to minimize fuel cost, but to comply as well as possible with a set of future pressure and flow values given at the boundary nodes. Finally we mention [4], who considered maximizing the amount of temporarily stored gas in the network while maintaining a feasible transient control of the elements. They also introduce a new discretization of the Euler Equations, which results in a formulation close to the algebraic form of the stationary model. They then use this discretization to obtain globally optimal solutions. For the compressors however, again only single units restricted by upper and lower bounds in the compression ratio as well as the absolute pressure difference have been modeled. This problem is solved by alternating between solving a MIP model, which is obtained by replacing the non-linear constraints by piecewise linear functions, and a NLP model, in which the discrete decisions from the MIP are already fixed.
All approaches of the above mentioned publications on transient gas network optimization use a rather idealized model to deal with compressor stations. In contrast to this, we present in this article a transient gas network optimization problem featuring a compressor model with a so far unmatched amount of detail. The model is based on the modeling presented in [19] for the stationary case and takes the above described substructures into account. We focus on those network areas containing the compressor stations as well as additional active elements and call them network stations, an example station is presented in Figure 1. These areas contain the majority of active elements in the network. Regarding the number of contained elements they are comparable to, or even larger than the networks considered in the above mentioned literature on transient optimization. However, one difference to general gas network problems is the shortness of pipes due to the proximity of the elements in a network station, see Table 1 in Section 5 for an overview of different network stations. Because of their shortness, the pipes ability to store gas is negligible owing to their small volumes. Furthermore, the pressure loss induced by friction in the pipe is dependent on the its length and has therefore reduced impact. This allows us to use a linear pipe model as introduced in [16] without losing much accuracy and still producing realistic results.
For each network station we are given an initial state as well as future demands in terms of both inflow and pressure levels at the boundaries. The goal is then to find a feasible control of all the network elements over time, which can be interpreted as a recommendation for network operators on how to control the network in the future. The overall objective is to meet the future demands as best as possible while simultaneously minimizing the total number of control changes. The latter is preferable since it reduces strain on the technical elements and enables the gas network operators to understand and actually perform the desired control recommendations.
The rest of the article is structured as follows: In Section 2 we will describe the mathematical models for all used elements and formulate a corresponding MIP model. The preprocessing needed to convert the given compressor data into a linear description of the feasible operating range will be introduced in Section 3. However, solving the resulting MIP is quite challenging due to the  Table 1 for more details to its properties. The colored triangles represent the entry and exit nodes of the station. Furthermore we have denoted the single network elements by (pipe), (valve), (regulator), and (compressor station). complex compressor station model. We therefore propose a different solution approach in Section 4 based on solving slightly adjusted versions of the presented MIP. Section 5 then presents our results on computing solutions to a large number of real world networks situations. We finish with the conclusion in Section 6.

Mathematical model
We model the gas network as a directed graph G = (V, A) in which the arcs A represent the different network elements and nodes V represent the junctions of the arcs. We split A into individual sets A = A pi ∪ A va ∪ A rs ∪ A rg ∪ A cs for the network elements considered in this paper, i.e., pipes, valves, resistors, regulators, and compressor stations respectively. Note that regulators are also often named control valves in the literature, e.g. see [8] or [19]. In a similar fashion we split the node set V = V b ∪ V 0 into boundary nodes and inner nodes respectively. Here, boundary nodes V b represent those having inflow and pressure level demand values for future time steps. We define the set of considered time steps as T 0 := {0, . . . , k} where T := T 0 \ {0} are the future time steps having demand conditions at boundary nodes. Associated with each time step t is a value τ (t) representing the time difference in seconds from t to the initial state time step 0. The most important quantities we will consider to describe the gas flow are pressure and mass flow. We have pressure variables p v,t at each node v ∈ V and time t ∈ T 0 as well as variables q a,t for the flow from l to r on each non-pipe arc (l, r) = a ∈ A \ A pi and time t ∈ T 0 . For a pipe (l, r) = a ∈ A pi , we have two flow variables q l,a,t representing the inflow into the pipe at end node l and q r,a,t representing the outflow out of the pipe at end node r, similar to the model of [8] on pipes or the one of [6] on all arcs. Our final flow quantity is the inflow at a boundary node d v,t entering the network for each v ∈ V b and t ∈ T 0 . Although we are given flow demands for the future, we allow deviations from them, and hence have to have a variable capturing the actual inflow value.
We assume that there are bounds on all stated quantities for each point in time t ∈ T 0 , so upper and lower boundsp v,t andp v,t on the pressure at each node v ∈ V, upper and lower boundsq v,t andq v,t on the flow on each non-pipe arc a ∈ A \ A pi respectively the inflow and outflow of each pipe a ∈ A pi , as well as upper and lower boundsd v,t andd v,t on the inflow at each boundary node v ∈ V b . Note that while pressure is always positive, flow itself can be negative, as it can represent flow in the opposite direction. For example, negative inflow at a boundary node represents flow out of the network at this node.
In the following we will describe each of the elements and at the same time introduce corresponding variables and constraints to create a MIP formulation. This MIP will not be solved directly, but is the basis for the three variants used in our overall solution algorithm described in Section 4.
Note that a list of all used variables can be found in the appendix in Table A.1.

Compressor stations
Compressor stations are responsible for increasing the pressure in the network and thereby the most important in controlling the flow of gas. They are also the most complex elements, having their own substructure and a large amount of operational restrictions. Our model of the compressor station is based on the description in [10], where the elements are called compressor groups instead.
Structure of a compressor station A compressor station (l, r) = a ∈ A cs has three different modes: Bypass, closed and active mode. In bypass mode the element is bypassed and therefore allows unrestricted gas flow without changing the pressure level. For the closed mode, the element is closed and thereby blocks the gas flow, which disconnects the network between the its end nodes. Finally, in active mode the gas is compressed and pressure is increased along the direction of the flow. When compressing, the compressor station can use a set of associated compressor units U a . These are the actual pressure increasing elements, each with a separate operating range. In the compressor station, these compressor units are combined in series and/or parallel to allow proper reactions to different compression requirements. The set of all allowed serial-parallel compressor unit combinations is called the set of configurations C a for a compressor station a ∈ A cs , from which exactly one active configuration has to be chosen if the compressor station is in active mode. For each of these configurations c ∈ C a , we create a polytope in the space (p l , p r , q) describing the feasible operating range of the compressor station using configuration c. This polytope is described as the intersection of a set of half spaces H c = {(w, x, y, z) ∈ R 4 } encoding inequalities of the form w · p l + x · p r + y · q + z ≤ 0. The creation of the feasible operating range of the configurations of the compressor stations is described in Section 3.
Compressor station model To model the above described constraints we use a disjunctive formulation. This is the most compact formulation in terms of number of constraints and variables for which holds that its LP relaxation is equal to the convex hull of its feasible points [1] [2]. For this we introduce binary "selection" variables m cf c,a,t for each configuration c ∈ C a of a compressor station (l, r) = a ∈ A cs , as well as m by a,t and m cl a,t for bypass and closed mode respectively. In addition we introduce corresponding sets of pressure and flow variables. The binary variables will force the pressure and flow variables of all non-selected configurations or modes to be zero and only enforce the constraints of the selected configuration or mode. The introduced pressure and flow variables are: p by a,t q by a,t bypass mode variables Note that we need only one p value for bypass mode, since here p l = p r holds. Also there is no q variable for the closed mode, since q = 0 holds in this case anyway. Furthermore, all introduced variables have bounds equal to the corresponding original pressure and flow boundsp l,t ,p l,t ,p r,t ,p r,t ,q a,t ,q a,t of the compressor station (l, r) = a ∈ A cs , enlarged to zero if necessary. We indicate these bounds by the variable symbol combined with an overscore respectively underscore.
We are now able to state the constraints for all (l, r) = a ∈ A cs and t ∈ T : p l−cf c,a,t m cf c,a,t ≤ p l−cf c,a,t ≤p l−cf c,a,t m cf c,a,t ∀c ∈ C a (5) q cf c,a,t m cf c,a,t ≤ q cf c,a,t ≤q cf c,a,t m cf c,a,t ∀c ∈ C a (7) p by a,t m by a,t ≤ p by a,t ≤p by a,t m by q by a,t m by a,t ≤ q by a,t ≤q by a,t m by w · p l−cf c,a,t + x · p r−cf c,a,t + y · q cf c,a,t + zm cf c,a,t ≤ 0 ∀(w, x, y, z) ∈ H c ∀c ∈ C a (12)

Pipes
Gas flow in pipelines is for operational purposes modeled as one-dimensional flow through a straight cylindrical pipe. When assuming a constant gas temperature Here x denotes the position in the pipe by its distance from the source node l, t the current time, ρ the density of the gas, v its velocity, D a the diameter of the pipe, g the gravitational acceleration, and λ a the friction factor of the pipe, which we assume to depend on pipe characteristics only, see Section 2.2.1. With s a ∈ [−1, 1] we denote the slope of the pipe, i.e., the quotient of the elevation increase between the pipes endpoints and the length L a of the pipe. In order to complete the system of equations describing the state variables p, ρ, and v, we add the equation of state for real gases to establish the connection between p and ρ as p = ρR s T z a .
The two new quantities that arise are the compressibility factor z a and the specific gas constant R s . We assume both values to be constant parameters, which for the compressibility factor is a common assumption in the gas transport literature, see for example [26] [4]. For the specific gas constant this follows from its dependence on the molar mass, which in turn is determined by the gas mixture which we assume to be constant. In the following, we will drop the terms ∂ t (ρv) and ∂ x (ρv 2 ) as they contribute only little to the equation under normal operating conditions [8] [26]. In addition, we reformulate the pipe flow equations in terms of the quantities we are interested in, i.e., pressure p and mass flow q, where q is defined using the cross sectional area A a = D 2 a π 4 of the cylindric pipe a as q = A a ρv.
Then we can write (13) and (14) as Since the spatial pressure change now mainly depends on the friction term including the friction factor λ a this model variant is often referred to as the friction dominated model, see model (FD1) in [3] respectively model (ISO3) in [7].
For the discretization, we use the implicit box scheme introduced by [6], respectively [20]. Here, the spacial domain is the length L a of pipe a = (l, r) and the time domain is the set of time steps T 0 , which we defined in the beginning of Section 2. Using the notation of flow into and out of a pipe, again defined in beginning of Section 2, as well as the function τ , we are able to write the discretized model for two adjacent time points t 0 and t 1 as a |q l,a,t1 |q l,a,t1 p l,t1 + |q r,a,t1 |q r,a,t1 p r,t1 + gs a L a 2R s T z a (p l,t1 + p r,t1 ) = 0. (18) As a final step, we will linearize the Momentum Equation (18) as proposed in [16] by fixing the absolute velocity |v| to a predefined constant in the friction term, where the absolute velocity is according to (15) and (16) defined as |v| = | RsT za Aa q p | = RsT za Aa |q| p . We found that the relative error stated in [16] is less relevant for the overall accuracy of the model, if the pipes are contained in network stations. The reason for this is, that the elements of a network station are usually clustered in a small geographic area. Therefore the contained pipes are relatively short in comparison to the rest of the network, see also Table 1 with statistics over different network stations. Since the friction based pressure reduction depends on the pipe length, the corresponding term and therefore the also stated relative error have much less impact than usual. This linearization allows us to model the pipe flow in a MIP context. The final equations for each pipe (l, r) = a ∈ A pi and all adjacent time points t 0 , t 1 ∈ T 0 are: p r,t1 − p l,t1 + λ a L a 4D a A a (|v l,a |q l,a,t1 + |v r,a |q r,a,t1 ) + gs a L a 2R s T z a (p l,t1 + p r,t1 ) = 0 (20) The constant |v x,a | for one of the end nodes x ∈ {l, r} of pipe a = (l, r) is determined based on the flow and pressure values of the given initial state, i.e., |v x,a | = R s T z a A a |q x,a,0 | p x,0 .

Friction and compressibility factor
The Darcy-Weisbach friction factor λ, describes the pressure drop on a pipe a caused by frictional forces and depends on the diameter D a and integral roughness k a of the pipe, as well as the current flow q and the dynamic viscosity η of the gas. For turbulent gas flow the most accurate description is given by the implicit Colebrook-White equation [5][10]. There exist a series of different explicit approximation formulas, typically depending on the Reynolds Number which describes the amount of turbulence of the flow. We use the formula of Nikuradse [24][10], which assumes infinite turbulence and makes the friction factor dependent only on the constant diameter D a and integral roughness k a of the pipe: For the compressibility factor z we use the approximation formula developed by Papay [27] [31], which is valid up to 150 bar and thus fits our considered pressure range of 1 bar to 100 bar well. It is given as Apart from the pressure p and gas temperature T Papay's formula depends on the gas mixture dependent pseudo-critical pressure p c and temperature T c , which we assume to be given constants. The constant compressibility factor z a per pipe a = (l, r) is then determined as an average of the corresponding values derived from the initial state pressure values of the pipes end nodes, i.e., by z a = (z(p l,0 ) + z(p r,0 ))/2.

Resistors
Resistors are artificial elements created to model points of high friction in the network, which can be caused by all sorts of special elements. Some examples of these elements are measuring equipment and complex local piping, which both are not captured by the other considered element types but need to be accounted for. The pressure drop caused by a resistor arc (l, r) = a ∈ A rs for time t ∈ T is defined by the Darcy-Weisbach equation [10]: Here the friction factor of the resistor is called the drag factor, and is represented by ζ a , a parameter of the element. The compressibility factor z a is determined in the same way as described for pipes, see Section 2.2.1. The formula is flow direction dependent, where p in,t is either p l,t or p r,t depending on q a,t being positive or negative (for q a,t = 0 it holds that p l,t = p r,t ).
As we did for pipes in Equation (20) we linearize the model by assuming a constant velocity |v| = RsT za Aa |q| p , which also includes the flow direction dependent pressure value. The equations for each arc (l, r) = a ∈ A rs and time t ∈ T then reads as The constant velocity value is again calculated based on the initial element state, and is defined as an average of the two velocities using the pressure from the corresponding resistors end nodes as

Valves
Valves are active elements that dynamically connect or disconnect two nodes by being open or closed respectively, and thereby change the network topology. Closed valves work exactly as the closed mode of a compressor station, while open valves imply the same behavior as the corresponding stations bypass mode. The mode of a valve is captured by the binary variable m op a,t . For a valve arc (l, r) = a ∈ A va and time t ∈ T we write

Regulators
A regulator or control valve is a valve with variable opening degree, used to reduce the pressure along the direction of the flow. Regulators can reduce the pressure in active mode and also be bypassed or closed like a compressor station. For each mode there is a binary variable, from which exactly one is equal to 1 at any time, i.e., the regulator always has to have a unique mode The implications of each of the modes can be modeled by the following constraints for each arc a ∈ A rg and all times t ∈ T q a ≥ 0.
Note that the flow is always positive, even in bypass mode. This occurs because all regulators have a flap trap which prevents flow going against the topological orientation.

Nodes
The nodes don't represent technical elements but rather establish the connections between them. The pressure coupling is realized by using the pressures at a node in the constraints of all its incident arcs. To connect the mass flow between arcs we have flow conservation constraints in each node. This means that the sum of incoming flows should match the sum of outgoing flows, resulting in the following constraints for all t ∈ T :

Network station
In addition to the constraints imposed by the single elements used in the network, the network station itself has a set of associated constraints. The model described below is an extension of the one described in [10] for similar structures. There, network stations are called compressor stations and the elements we call compressor stations are called compressor groups, as already mentioned in Section 2.1. Operation modes model We introduce binary variables m om o,t for each o ∈ O and t ∈ T 0 . They represent whether the operation mode has been selected, or respectively if the network station is in the given operation mode at this point in time. Furthermore, we define the function M (o, a), which maps operation modes to the individual modes and configurations of valves and compressor stations:

Network station structure
Note that we assume that all valves and compressor stations are determined by the network station operation modes. In general however, there are valves which are not controlled by the operation modes, and whose mode is already given as a fixed decision over time and cannot be changed. We can handle these valves by pre-processing, and either contracting the associated valve in the open case, or simply removing the valve in the closed case. Using M (o, a) we can then state the operation mode related constraints for all t ∈ T : Note that either Equation (35), Equation (36) or one of the constraints of Equation (37) for one of the configurations c ∈ C a can be omitted, because it follows from the remaining constraints combined with Equation (1).
Operation mode unavailability Certain operation modes are not available at specific points in time. The basis for this is the non-availability of compressor units over time, which is part of the model input data. As explained in Section 2.1, a configuration c ∈ C a of some compressor station a ∈ A cs represents the serial and/or parallel combination of a subset of the compressor stations compressor units. Hence, the unavailability of a certain compressor unit at time t results in the unavailability of all configurations which use this unit. On the next level, each network station operation mode defines the mode and (for the active mode) the configuration of each compressor station in the network station. Hence, all network station operation modes using a configuration for a compressor station which is unavailable for time t will be unavailable for t, too. To implement this in the model, we just fix the variables m om o,t for the corresponding operation mode o and time points t to zero, i.e., remove them from the model.
The unavailability of a compressor unit may not be aligned with the set of discrete time points T 0 , i.e., the unavailability period may start or stop in between two adjacent time points. To be able to tell which of the two points is then effected by this, we have to establish an interpretation for the operation mode of a network station between two time steps. Therefore, we define that if a network station has the operation mode A at the discrete time point t, then we also assume the station to have operation mode A in the following time interval up to the next discrete time point t + 1. From this definition it follows that if a network station operation mode is unavailable for some k ∈ (t, t + 1) with t, t + 1 ∈ T 0 , then the operation mode is unavailable for time t but potentially available for time t + 1.
Operation mode transition times If the operation mode of a network station is changed from mode A to some other mode B, the transition takes a given amount of time θ(A, B). A transition time is given for every possible combination of operation modes and is part of the input data. While in transition between the two modes, the network station acts as follows: Assume the transition starts at time t 0 , then for t ∈ t 0 , t 0 + θ(A,B) 2 the station uses mode A while for t ∈ t 0 + θ(A,B)) 2 , t 0 + θ(A, B) the station uses mode B. In other words, the station stays in mode A until it reaches the middle of the transition period and then changes to mode B. While being in transition, the whole transition period is blocked for other changes, i.e., two transition periods cannot overlap. Since we are only able to change network station operation modes at discrete time points, we assume that for each transition the middle point is in T 0 . This is also in line with our interpretation of network station operation modes in between discrete time points.
In Figure 2 we see an example of an operation mode sequence with corresponding transition times. The time point t X represents the first t ∈ T 0 in which operation mode X is active, which according to the interpretation in the previous paragraph is also the first discrete time point t in which X is active. In this example, there would be a conflict in the transition times θ(C, D) from mode C to mode D and θ(D, E) from mode D to mode E, since the two time periods overlap. Note that this is true, although for each single transition period the network station has the correct operation mode for each point in time, i.e. first mode in the first half of the period and the second mode in the second half.
Time We will not cover the transition time restrictions in the MIP model, but will make sure our solutions respect them in a different way, see Section 4.2. For this we only need a way to check for a given sequence of operation modes if all the transition times are valid, i.e., the corresponding periods do not overlap, which we do as follows: For each network station operation mode in the sequence, we check if the time in which the operation mode is active is at least as big as the sum of adjacent transition time components, i.e., the sum of half the time of the transition into that mode and half of the time of the transition out of that mode. In the given example, we would have to check holds, where τ (t) represents the time difference of a time step t from the initial state time. If the above holds for all operation modes in the sequence, then the transition periods can never overlap since they are centered around the time points in which the mode change happens.
Note, that for the last mode in the sequence, we do not need to do any checks at all. We do so as we cannot determine how long into the future the operation mode will be active for. Hence, we assume it is active long enough to comply with the given transition time, allowing the possibility to apply operation mode changes up to the very last time step. For the first mode, a similar assumption would be too optimistic, since the desired operation mode change would have then already been triggered before the start our time horizon. However, there is no known transition into this first mode, so we need only check if half the transition time into the next mode fits into the first modes active period.
Flow directions model Similar to the way we handle operations modes, we introduce binary variables m fd f,t representing the selection of flow direction f ∈ F at time t ∈ T 0 . To be able to state the connection between the chosen flow direction and the actual boundary node inflow pattern, we represent each flow direction f as a tuple of the set of boundary nodes having inflow into the station f + and the set of boundary nodes having outflow out of the station f − . Hence, using the power set P, a flow direction f is defined as Using the inflow variable d v,t for boundary node v and time t we can define the flow direction constraints for each t ∈ T as: Flow direction exit pressures Apart from the consequences that the flow direction choice has on the corresponding boundary node inflows, it may also influence the upper pressure bounds on some boundary nodes. Each node v ∈ V b−ex ⊆ V b is given an upper pressure boundp exit v , which is only active if the node is in the outflow set of the currently active flow direction, i.e., serving as exit of the network station. The corresponding constraint is the following for each time t ∈ T Flow direction conditions As the final constraints concerning flow directions, there exist in some network stations a special set of conditions W, which concern the amount of flow over sets of boundary nodes. These conditions must be met for the flow direction which they are associated with to be active. Each condition w = (f, V w1 , V w2 ) ∈ W states that the flow over a set of boundary nodes V w1 has to be smaller than the flow over a second set of boundary nodes V w2 if f is selected. Note, that the flow over a set of boundary nodes V w for time t is defined as v∈V w |d v,t |, which is potentially a non-linear expression due to the absolute value. However, each boundary node set V w which is part of some w ∈ W is known to be either a subset of f + or a subset of f − of the corresponding flow direction f = (f + , f − ). For this reason, we always know the sign of the flow over V w in advance and hence using the following function definition for easier notation Here C 1 denotes a big-M constant, which can be set as follows:

Scenario and initial state
For the future we are given scenario values for the boundaries of the station in terms of pressure and inflow. While we are given one pressure valuep v,t per boundary node v ∈ V b for each future time point t ∈ T , the flow demands are only given for sets of boundary nodes, which are called the fence groups of the network station forming the set G. For each set g ∈ G, which can also consist of only a single boundary node, and each future time point t ∈ T , the sum of inflows should be equal to the given demand valued g,t .
We do not require strict obedience of the given demand valuesp v,t andd g,t , but instead allow deviations from them, which will be punished in the objective function. These deviations are captured in the slack variables σ. For pressure we have σ p+ v,t and σ p− v,t which capture the positive and respectively negative difference between the pressure values of boundary node v at future time t and the given demandp v,t . We then have the slack variables σ d+ v,t and σ d− v,t associated with the inflow, which capture the positive and respectively negative contribution to the difference between the inflow demandd g,t of fence group g ∈ G at each future time step t and the sum of the corresponding inflow variables of each boundary node v in the fence group g.
The described relations can be modeled for each future time step t ∈ T as: Note that by using inflow slacks, it is possible to choose a flow direction for the station that does not fit the inflow demand valuesd g,t in terms of the constraints (40)-(41). The second set of prescribed values are those of the initial state. A complete list of given values are as follows: The initial pressures p v,0 for each node v ∈ V, the in-and outflow values q l,a,0 and q r,a,0 for each pipe (l, r) = a ∈ A pi , the flow values q a,0 for each non-pipe arc a ∈ A \ A pi , the operation mode of the network station fixing m om o,0 for each o ∈ O, the thereby determined values of the variables m op a,0 , m by a,0 , m cl a,0 , m ac a,0 for each corresponding valve and compressor station arc a ∈ A va ∪ A cs , the values of m cf c,a,0 for each compressor station a ∈ A cs and corresponding configuration c ∈ C a , and finally the modes m by a,0 , m cl a,0 , m ac a,0 of all regulators a ∈ A rg . All these variables are actually parameters of the model and fixed to the corresponding value.

Objective
As already describe above, the objective function should punish deviation from the given future scenario, while simultaneously favoring those solutions with a stable control of the single elements. While the first part can easily be described by using the slack variables introduced in Section 2.8, we still need to define a measure for the stability. To do so, we first quantify the discrete changes of the control in binary variables, i.e., the change of the network station into a new operation mode at time t in variable δ om t as well as the change into a new mode of regulator a at time t in variable δ rg a,t . Furthermore, we capture the start of compressor unit u at time t in the variable δ us u,t , since starting a compressor is a very time and energy intensive action and should therefore be avoided if possible. The mode changes of valves and compressor stations are not tracked separately, as the start-up of compressor stations is already penalized and the change of valves only happens for multiple elements at once in the context of operation mode changes, see the Equation (34). This described variable behavior is realized by the following constraints for each future time step t ∈ T , where we denote the set of configurations of the containing compressor station a ∈ A cs which use compressor unit u ∈ U a by C u a.
In order to obtain a smooth operation for network situations without discrete mode switching, we add variables tracking the change of the operation point of single elements, i.e., their corresponding changes in flow, incoming pressure and outgoing pressure. We do this for all elements with an actual operation point, i.e., regulators and compressor stations in active mode, while ignoring times in which the network station operation mode or regulator mode has just been changed. The variables δ rg−pl representing changes of the incoming pressure, outgoing pressure and flow of an active regulator a respectively δ cs−pl representing the corresponding value changes of an active compressor station a can be established using the following constraints for each (l, r) = a ∈ A rg and each t ∈ T respectively for each (l, r) = a ∈ A cs and each t ∈ T q a,t − q a,t−1 ≤ δ cs−q a,t + (m by a,t + m cl a,t + δ om t )(q a,t −q a,t−1 ) (65) q a,t−1 − q a,t ≤ δ cs−q a,t + (m by a,t + m cl a,t + δ om t )(q a,t−1 −q a,t ).
Note that we needed to define the upper bound constraints for the discrete change variables δ om t and δ rg a,t to allow them to be equal to one only if there really is a discrete change. Otherwise it might have been possible to set the change variable to 1 although there is no actual discrete change and thereby avoid high costs imposed by the continuous change variables, which is not a desired behavior.
Finally, we are able to state our objective function, which minimizes the weighted sum of the change variables and the slack variables defined in Section 2.8 as where the w * parameters denote the corresponding positive weights given to the single quantities. Note that the weights for pressure and flow slack variables are additionally multiplied by the length of the corresponding time interval. We do this, since the slacks track deviation amounts over time and we therefore need to take the time interval length into account for their objective function value coefficients.

Final model
Putting everything together, we can formulate our problem in the following transient gas flow model P: Note that we apply the constraints only starting from time step 1 explicitly excluding the initial time step 0. We do this since the initial pressure and flow values as well as the initial modes described in Section 2.8 are not guaranteed to fit our model, and simply serve as a starting point for the calculations.

Feasible operating range of compressor station configurations
As already explained in Section 2.1 each compressor station arc (l, r) = a ∈ A cs has an inherent substructure. It represents a set of compressor units U a , which are the actual compressing elements and can be combined in a serial and/or parallel fashion, to either allow for a higher compression ratio, a higher flow rate or a mixture of both. The set of all feasible serial-parallel combinations is called the set of configurations C a of a compressor station. For each of these configurations, we will describe in this section how to obtain their polytope description given as intersection of the set of half spaces H c , which is used in the model as described in Section 2.1. The polytope of a configuration is created based on the polytopes of the compressor units, which are created from combining the corresponding feasible operation ranges with a maximum power restriction. Note that we drop the time index in this section for the ease of notation.

Feasible operating range for a single compressor unit
A compressor unit is a combination of a single compressor machine (or just a compressor), which increases the gas pressure in flow direction, and a corresponding drive, providing the power needed to run the compressor. For each compressor machine we are given a feasible operating range as polytope in the space ( pr p l , Q), where the volumetric flow rate Q is given as Note, that since we explicitly consider density and pressure at the incoming node l of the compressor, we use the compressibility factor z l := z(p l,0 ) instead of z a , here as well as in the rest of this Section. Usually, the feasible operating range, sometimes also called "characteristic diagram" or "performance curve", is given as area in the dimensions (H ad , Q) restricted by a set of possibly concave quadratic curves, see e.g. [10] [25]. The quantity H ad denotes the specific change in adiabatic enthalpy and is defined as using for the isentropic exponent κ the constant value 1.296, as stated in [10].
The transformation of such a feasible operating range using H ad into our format is easily doable, since there is a unique transformation from H ad to pr p l obtained by simply rearranging (68). The diagram then just has to be linearized by approximation or relaxation to obtain the polytope description in the desired space.
In addition to the feasible operating range polytope, each compressor machine is given an upper bound on the absolute pressure increase∆ p ≥ p r − p l and an upper bound on the maximum power to useP based on the power of the compressor drive. The power needed for compression depends on the above defined H ad as well as the mass flow and is given as Here η ad denotes the adiabatic efficiency of the compression, which in theory depends on the actual point of operation in the feasible region but is here assumed to be a given constant per compressor unit. An example of a feasible operating range of a compressor unit is given in Figure 3a, where different levels of the maximum pressure difference bound and the maximum power bound are given based on different values for the incoming pressure p l .  Ignoring the power bound for a moment, we will now lift the feasible operating range into the (p l , p r , q) space we are interested in. Therefore, we first transform each of the faces a 0 + a 1 Q + a 2 pr p l ≤ 0 of the original polytope into constraints a 0 p l +ã 1 p r +ã 2 q ≤ 0 of the higher dimensional space using the equation of state for real gases (15): To bound the polyhedron described by the new constraints, we add the restriction of the absolute pressure difference as well as two pressure bounds of the end nodes of the compressor station arc (l, r) = a ∈ A cs containing this machine A picture of the three dimensional polytope resulting from the feasible operating range of Figure 3 can be seen in Figure 4a.

Power bound linearization
Until now, we have ignored the maximum power constraint in the three dimensional feasible operation range polytope, that restricts the value of the available power P for compression. Figure 3a shows the constraint P ≤P cuts into the original two dimensional feasible operating range in a non-linear and nonconvex fashion. The same holds for the feasible operating range representation in (p l , p r , q). In the following, we are going to derive a linear approximation to this constraint that can then be added to the operating range polytope. Therefore, we generate a set of N random sample points from within the three dimensional operating range polytope, which we represent by the vectors p l , p r , q ∈ R N . We used a rejection-free sampling method, which is based on the sampling from a 3D tetrahedron, see [30]. The idea behind this method is to sample uniformly from a through the tetrahedron generated parallelepiped and then project the sample back to the tetrahedron. To extend this method from tetrahedra to general polytopes, we first compute a triangulation of the polytope. We compute the volumes of the tetrahedra and use their relative volume as a probability function. This allows us to first randomly select a tetrahedron and then sample from it using the in [30] described method. Our approach has high preprocessing costs of computing the triangulation tetrahedra and their volume but then it is very cheap to generate samples. On the contrary for rejection based sampling there is no lower bound for how many samples we need to generate for a single sample point, since the polytope can be arbitrarily small in comparison to the space we sample from (typically the enclosing axis-parallel cube). For the operating range polytopes we observed a significant speed-up while generating 50000 samples.
For each point, we then determine the corresponding compressor power using Equation (69) and store these values again in a vector P ∈ R N . The goal is now to obtain a linear approximation of the power function, i.e., an approximation of the form P ≈ a 0 + a 1 p l + a 2 p r + a 3 q.
We achieve this by applying an ordinary least-squares method in order to determine the coefficients of the linear function as min a0,a1,a2,a3 || P − (a 0 + a 1 p l + a 2 p r + a 3 q) || 2 .
Finally, we can formulate the new linearized power bound constraint based on the obtained solution values (ā 0 ,ā 1 ,ā 2 ,ā 3 ) as a 0 +ā 1 p l +ā 2 p r +ā 3 q ≤P, which is then added to the three dimensional polytope to create the final three description of the feasible operating range of a compressor unit. An example the final polytope is illustrated in Figure 4b, while the linearized power bound projected to the original two dimensional operating range can be seen in Figure 3b

Feasible operating range for a compressor station configuration
As the final step, we will create the polytope description for each configuration by combining the polytopes of the used compressor units. The procedure was originally described in [18], while we exactly follow the steps of the variant described in [32]. Each configuration c is given as a serial sequence s 1 , . . . , s nc of parallel compressor machine arrangements, where combining compressors in series allows for higher output pressures by multi-step compression while parallel compression increases the throughput in terms of flow. We call such a parallel machine arrangement stage and denote by U s the set of compressor units combined in stage s ∈ {s 1 , . . . , s nc }.
We will now start with the definition of the feasible operation range polytope P s of such a stage s ∈ {s 1 , . . . , s nc }. For each compressor unit u ∈ U s the corresponding polytope is denoted by P u , and using this we can describe P s as P s := { (p l , p r , q) | ∀u ∈ U s ∃(p l,u , p r,u , q u ) ∈ P u with p l = p l,u ∀u ∈ U s , p r = p r,u ∀u ∈ U s , q = u ∈Us q u }.
In other words, a valid operation point of the stage is represented by each unit operating at the same incoming and outgoing pressures, while the the mass flow through the stage is the sum of mass flow through the individual units.
In a similar fashion, we can then define the polytope P c for the overall configuration. Here, the mass flow through all the stages stays the same, while the outgoing pressure of some stage s i in the sequence has to match the incoming pressure of the subsequent stage s i+1 . Using this logic P c can be defined as P c := { (p l , p r , q) | ∀s ∈ {s 1 , . . . , s nc } ∃(p l,s , p r,s , q s ) ∈ P s with p l = p l,s1 , p r = p r,sn c , p r,si = p l,si+1 ∀i ∈ {1, . . . , n c − 1}, q = q s ∀s ∈ {s 1 , . . . , s nc } }.
Finally, the set of half spaces H c used in Section 2.1 to define the feasible operating range of the configuration c are simply the facets of the polytope P c representing the feasible operating range.
Note, that due to the symmetric polytope creation the parallel compressor units of a stage do not need a specific order. In contrast, the serial stage sequence is indeed important since in general two sequences using the same stages but in a different order will have different operating ranges.

Specialized network station algorithm
The MIP model P presented in Section 2.10 turns out to be quite challenging according to our experiments even though we are only considering parts of a larger gas network by restricting ourselves to network stations. We therefore created a specialized algorithm to solve the problem for network stations. The baseline insight of it is that the elements in network stations are all very close to each other, making the corresponding pipes inside the station relatively short, see also Table 1 in the computational Section 5. This means that their capability to store gas, which is often referred to as linepack, is insignificant in comparison to the long pipelines in between the network stations. Thus there is no possibility to "prepare for the future", i.e., use linepack to pre-transport gas to handle upcoming critical demand situations inside the network station itself.
This led us to the idea of splitting the time coupled model P into individual stationary models to determine the best operation mode and flow direction for each individual time step. We then use this information to find a solution in terms of operation modes and flow directions over the whole time horizon. This solution will respect the transition time conditions, which have not been modeled in P explicitly, see the corresponding part of Section 2.7.
Since our goal is to find a feasible solution for the presented model P of Section 2.10, we will not only have to determine the network station operation modes and flow directions, but also all other involved quantities. Hence, after determining these values we still need to calculate a transient version of P, which for example also takes care of minimizing the differences in the operation points of the single elements, see Section 2.9. We do so by prescribing the operation modes and flow directions determined in the stationary calculations to make the model tractable. The whole algorithm is summarized in Algorithm 1, in which the creation of the operation mode sequence is already split into the two steps described in Section 4.2. Note, that our algorithm does not have the guarantee of finding globally optimal solutions, which would have been the case when directly solving P enhanced by a proper description of the operation mode transition time constraints. However, it has the potential to obtain good overall solutions for P, since we assume the operation mode depending objective function weights w om for changing to a new operation mode and w us for starting a new compressor unit to dominate the other objective weights, see also the used objective function values in our computational experiments stated in Section 5.1.
In this section we will first introduce in 4.1 the three different variants of the model P that we will use in our algorithm. Afterwards, we will present the algorithm itself, that determines operation modes in 4.2 and finish with the so-called smoothing procedure in 4.3, in which we will find the feasible transient solution to our problem.

Model variants
As mentioned above, we will not solve the complete transient model P directly, but use the following variants of it in our overall solution approach. P s -Stationary model For the first variant we solve a stationary version of the model to determine the best operation mode and flow direction for one independent time step t. The necessary changes in P mainly affect the pipe model. In the stationary case a pipe no longer has the possibility to store gas, since the incoming and outgoing pipe flows are balanced, i.e., q l,a,t = q r,a,t = q a,t for all pipes (l, r) = a ∈ A pi . This is due to the Continuity Equation, in which ∂ t p = 0 holds as for all time dependent derivatives resulting in the mass flow balance. Hence, the Continuity Equation is no longer part of the model and the stationary Momentum Equation (20) for pipe (l, r) = a ∈ A pi and the stationary time step t we are considering can be stated as Note that we still calculate the fixed velocity based on the initial pressure and flow values from time step 0, since we are looking for a feasible solution for the original model P in the end. For all other elements the constraints only consider exactly one time step and we therefore simply apply them for the one time step t of the stationary case. The only other part to adjust is the objective function, where we keep the penalties σ p+ v,t , σ p− v,t , σ d+ v,t , and σ d− v,t for each boundary node v ∈ V b since they are defined based on each individual time step t. In addition, we will keep tracking the change to a new network station operation mode in variable δ om t as well as the start of new compressor units using δ us u,t for all u ∈ U a a ∈ A cs . This is done by calling the model with the parameter prevMode, where prevMode represents the operation mode of the previous time step. The other variables tracking changes in regulator modes δ rg a,t as well as the current point of operation of regulators and compressor stations as δ rg−pl , and δ cs−q a,t for all a ∈ A rg respectively a ∈ A cs will be removed from the model, as well as the constraints (48)-(53) respectively (55)-(66) defining their behavior. The final stationary objective function for the time step t under consideration reads as and m fd ft,t to 1 for all future time steps t ∈ T , the majority of binary variables can be replaced by constants, since the operation modes already decide valve and compressor station modes, as well as the configuration of all active compressor stations. Only the binary variables m ac a,t , m by a,t and m cl a,t for the mode of a regulator a ∈ A rg are still to be decided.
In addition, a lot of implicating big-M constraints can already be resolved, i.e., we can remove the current formulation and just add the implied constraints, if the corresponding condition is fulfilled. Examples for this are the constraints (22)-(25) describing the valve behavior or the flow direction conditions (43). Furthermore, we no longer need to use a disjunctive model, but can apply the corresponding constraints (2)-(12) of the active mode and/or configuration of time step t directly to the variables p l,t , p r,t and q a,t for each compressor station (l, r) = a ∈ A cs . P sf -Stationary with fixed operation mode As a last variant we basically combine the two variants above and use the stationary version of the model with already fixed operation mode. Note, that in contrast to model P f the flow direction of the network station is not already given, which results in more binary variables and still to decide big-M constraints. However, this variant still results in a very small and rather simple model and we can therefore solve it very often to test the appropriateness of a given operation mode for a certain time step.

Determining operation modes
Our algorithm to determine the operation modes of the network station is split into two steps: First, we create an initial solution by a greedy, forward oriented procedure presented in 4.2.1. We then in a second step improve this solution by testing if certain operation modes can be replaced by similar ones to find a better sequence of operation modes over time. This second step is described in 4.2.2.
Regarding the flow directions we will return for each time step t that flow direction, which has been chosen in the optimal solution of the stationary model which determined the returned operation mode for t. For simplicity of notation, we will not further mention them in the rest of this section.

Initial solution creation
To find a first feasible sequence of operation modes over time, we follow a rather simple idea. In order to keep the number of needed operation mode changes small, we determine an operation mode for time step t by first testing the used operation mode of the previous time step t − 1 using P sf . Only when this previously used operation mode yields a costly solution in terms of the objective function value will we use the general stationary model P s to determine the best operation mode for t. By this mechanic, we also reduce the amount of calls to the P s model, which are in general much more expensive in terms of computing time then calls to the P sf model. A detailed description is given as Algorithm 2.
There are a few things to note about Algorithm 2. First, the parameter prevMode given to the calls for solving the models P sf in line 5 and P s in line 13 is the operation mode of the previous time step, which we need to determine the operation mode change and compressor unit start variables δ om t respectively δ us u,t for some unit u ∈ U a a ∈ A cs and t ∈ T as explained in Section 4.1. In the call to P s we furthermore give the parameter validModes, which replaces the set of valid network station operation modes O. We also call the functions modeAvailable and transitionsWork in Algorithm 2. These refer to the operation mode unavailability and the transition time restriction introduced in Section 2.7, where modeAvailable checks for a given operation mode o and time t if o is available at t and transitionsWork performs the checks described in Section 2.7 to test if a given operation mode sequence is valid regarding the corresponding transition times. In addition, Algorithm 2 uses a function called notSoonInfeasible. Here, we check if choosing a new operation mode for time t would result in an infeasibility at one of the subsequent time steps caused by a combination of operation mode unavailability and too long transition times. More specifically, we check if the new operation mode for time t will become unavailable in one of the future time steps. If this is the case, we then test if there is enough time left to transition into another operation mode until then, also taking into account the time needed by the transition from the old operation mode at time t − 1 to this new operation mode at time t. This look into the near future turned out to be necessary according to our computational experiments in order to avoid cases where Algorithm 2 gets stuck in infeasible situations.
Two other lines in the algorithm may require further explanation. In line 6 we decide if the previous operation mode is good enough for the current time step by comparing its stationary objective function value against the cost of an operation mode change w om . If the objective function value is indeed smaller we know that the previous operation mode is the best option considering this individual time step given the chosen operation modes for the past time steps, since each other operation mode would at least have to pay the penalty of w om for changing the operation mode. As a last point to mention, Algorithm 2 may abort without a Data: Operation mode o 0 of the network station in the initial state Result: A list of operation modes for each time t ∈ T 0 1 operationModes ← list() 2 operationModes.add(o 0 ) 3 for t ∈ T do 4 oldMode ← operationModes.last() // call P sf with fixed operation mode oldMode for time t 5 oldModeFeasible, oldModeCost ← P sf (oldMode, t, prevM ode = oldMode) 6 if oldModeFeasible and modeAvailable(oldMode, t) and oldModeCost < w om then // operation mode of t-1 is also good for t if not bestModeFeasible then abort without solution // bestMode is best choice for t 15 operationModes.add(bestMode) 16 return operationModes Algorithm 2: Initial solution creation feasible solution in line 14. This is no proof of infeasibility, since in theory cases are possible, where we abort although a feasible solution exists. However, in all of our test cases we have never aborted the algorithm at this point, also see our computational results in Section 5.3. Furthermore, the combination of transition times and unavailable operation modes can lead to very hard to find feasible solutions, which makes the design of an algorithm performing reasonably fast on average but guaranteeing to find all feasible solutions a challenge. Therefore, we leave this problem open for future research.

An improvement heuristic
After we have found a feasible solution using Algorithm 2, we look for further improvements of it. Due to its design, Algorithm 2 only considers individual time steps to decide which operation mode to choose for each time step. However, we can easily imagine a situation in which the operation mode o 1 found by P s is best for time step t, but another operation mode o 2 is slightly better for all subsequent time steps and would have been the overall better choice at time t. We might even be able to avoid operation mode changes, if o 1 would become unavailable in the future, while o 2 has only slightly worse objective function values, but stays available.
Data: A sequence S of valid operation modes over time Result: A valid sequence S * of operation modes with obj(S * ) ≤ obj(S) 1 backwards ← T rue 2 while not having two iterations without improvements do To deal with these situations, we created for a given feasible solution, represented by a sequence of operation modes over time, the improvement heuristic stated as Algorithm 3. Here the idea is, to identify all sequences of identical operation modes over time in the solution. We call these sequences stable phases or just phases of a feasible solution. Obviously, the switch from one phase to the subsequent one happens if the operation mode changes to a new mode. For each of these phases we then check if we can replace the operation mode of the whole phase with a similar one being more beneficial in terms of the objective function value.
We obtain these similar network station operation modes from the call of the function convexCombination, which is the key feature of Algorithm 3. To define it, we use the the function M (o, a) returning the mode or active configuration of a valve or compressor station a in operation mode o, see Section 2.7. Furthermore, we denote by U(x) the compressor units used in mode or configuration x ∈ {by, cl} ∪ C a for some compressor station a ∈ A cs , where U(by) = U(cl) = ∅.
Then we first define the function convexCombinationCS on a tuple (x, y) with x, y ∈ {by, cl} ∪ C a as convexCombinationCS(x, y) := {x, y}∪ c ∈ C a | ∀u ∈ U(x) ∩ U(y) : u ∈ U(c) ∧ ∀u ∈ U(c) : u ∈ U(x) ∪ U(y) . Note here, that while we allow a compressor station a to have a configuration using a compressor unit set "in between" the used compressor unit set of the configurations used in o 1 and o 2 for a, we only allow the exact valve mode combination used in o 1 or the one used in o 2 . The reason for this is, that a valve mode combination enables a very specify set of paths through each network station, and it is very unlikely that a valve mode set obtained from combining the modes used in the two given operation modes yields operation modes, which are able to handle the same demand situation. Since, each of the modes obtained by calling convexCombination is tested in Algorithm 3, we hereby restrict the result set to the most promising candidates.
Apart from calling convexCombination, Algorithm 3 uses the two functions transitionsWork, which works in the same way as described for Algorithm 2 above and allModesAvailable, which is similar to modeAvailable from Algorithm 2, but instead of checking the availability of a given operation mode o for time t checks the availability of a whole sequence of operation modes at the times corresponding to the position in the sequence. Furthermore, we evaluated the objective function value of a sequence of operation modes using the function obj by successively calling P sf for each operation mode and time corresponding to its position in the sequence. If one of the models turns out to be infeasible, the returned objective function value will be infinity.
Finally, we note that we decided to start the algorithm in the backwards oriented mode. The reason for this decision is that the initial solution is obtained by Algorithm 2 which was operated in a forward direction. Furthermore, we highlight that Algorithm 3 has the potential to reduce the total number of needed operation mode changes, since the two original operation modes are always part of the result of convexCombination. In addition, it is possible that an operation mode change from the loop of line 5 has already been removed in the previous iteration by replacing one of the involved operation modes with the other one, which makes the check in line 6 necessary.

Transient solution smoothing
As a final step of our specialized network station algorithm, we solve the transient model variant P f with fixed network station operation modes and flow directions.
We obtain both for each time step from the stationary model solutions created in the previous steps of the algorithm. When comparing pressure and flow values at single points of network over time, we expect the transient solution states to be more similar in general and in case of changing conditions to be more smooth compared to the series of stationary solution states. This is due to the missing penalty of operation point changes in the independent stationary models, which may result in considerably different solution states. This difference may for example occur in the pressure level of nodes inside the station, even if the demand situation as well as the determined operation mode and flow direction are the same. In our computational experiments we observed that even though most of the binary decision variables of P are fixed in P f , only a limited number of time steps can be solved for large network stations. Therefore, we use a rolling horizon approach to solve P f , which is described in Algorithm 4. Here, we specify a small fixed time horizon size h, which represents the number of time steps to solve in model P f including the time step for the given initial state. We then solve a series of models P f , while always fixing the earliest time step and shifting the time horizon by 1 in each iteration. In the function call to solve P f in Algorithm 4, we give the subsequence of operation modes and flow directions, corresponding to and also encoding the current time horizon to solve. Furthermore, we specify the state to use as the fixed initial state.
The main benefit of this method is, that increasing the size |T 0 := {0, . . . , k}| of the overall time horizon only increases the number of equally sized and therefore similarly complex MIP models to solve rather then increasing the complexity of the model, which may lead to an exponential increase in runtime.

Computational experiments
In order to verify the competitiveness of Algorithm 1 in terms of both solution quality as well as execution time, we evaluated it to a large number of test instances. These instances represent network stations in the network of our project partner Open Grid Europe(OGE), for which we generated scenario values based on historic real-world situations in the network.
Throughout this chapter, we will state all flow values as volumetric flow under normal conditions in the unit 1000m 3 /h. To create these values from mass flow values in kg/s we use the normal density ρ 0 given in kg/m 3 , which is determined by the gas mixture and assumed to be constant all over the network. The linear transformation formula then reads as [1000m 3 /h] = 3600 1000ρ0 [kg/s].

Instances
We consider 7 different network stations from the network of OGE with different sizes and properties. An overview can be found in Table 1. For each of these stations, we have a set of 159 instances. These have been created by solving a macroscopic gas flow model on an aggregated complete network containing simplified versions all the network stations. A full description of the model can be found in [17].
Name |V| |A|  As initial states we used state values that actually occurred for the network of our project partner. Analogously we are given measured pressure and flow values for the next 12 hours after the initial state time at the boundary nodes of the network containing all the 7 network stations. These instances are distributed over a period of 91 days. For each day we have two instances, whose initial state times have a difference of 12 hours. Inside this data period, we unfortunately are missing instances for 8 days as well as having only one instance for another 7 days due to technical problems during the data creation at our project partner. Therefore, our final instance set consists of 2 · 91 − 2 · 8 − 7 = 159 instances.
The solutions of the macroscopic gas flow model for each of these instances yield pressure and inflow values at the boundaries of each network station. These values represent the scenario values for each instance of the test set of this paper.
For the macroscopic gas flow model, the time horizon of 12 hours has been split into 4 periods of 15 minutes followed by 11 periods of 60 minutes, resulting in 15 future time steps in total.
For our analysis, we created the following four different partitions of the 12 hour time horizon, where the horizon is built from left to right per row. 12

Computational setup
We performed our computations on a cluster using 4 cores and 16 GB of RAM of a machine being composed of two Intel Xeon CPU E5-2670 v2 running at 2.50 GHz. As a solver for the underlying MIP problems we used Gurobi in version 8.1.0 [12], which we accessed via the Pyomo modeling language [14] [15]. Since the corresponding MIP models are numerically challenging we used the solver with maximal NumericFocus parameter. In addition, we specified in Table 2 the optimality conditions and maximum run times for each solved model variant.

Variant
Rel. Gap Abs. Gap Time limit Finally, we specify the rolling horizon parameter h to be 4, so we are always solving the smoothing for 4 future time steps. We found in our experiments, that this number represents a good trade-off between efficient model solving speed and foresightedness of the obtained solution.

Results
As described in Section 5.1 above, we tested Algorithm 1 on 159 start times × 7 network stations×4 time horizons = 4452 instances. For all of these instances, it found a feasible solution. In particular, we always found a feasible solution in the initial solution creation in Algorithm 2 described in Section 4.2.1 although this is not guaranteed by the algorithm itself. In Figure 5 one can find an overview of the average run times of Algorithm 1 sorted by station and number of solved time steps. As one can see, the ranking of the single stations regarding run time as well as the ratio between the single stations run times is stable over the different time horizon partitions. Furthermore, the run time per station corresponds roughly to the complexity deduced from the station statistics in the sense that more nodes, arcs and operation modes make the overall problem more complex to solve and therefore increase the run time. The exception to this reasoning is station C, which has a rather high average run time although its node and arc counts are comparable to station D and the number of operation modes is even the smallest of all seven stations. We assume that this is due to the stations topology, in which the regulators seem to be arranged in a way which is hard to solve for the final smoothing step of the algorithm. Another noticeable fact is the almost perfectly proportional average run time increase with increasing number of time steps. As mentioned in Section 4.3 this property was an explicit goal when designing the algorithm.
In addition to the average run times, we also depict the distribution of run times sorted by station in Figure 6 for instances solved for 12 time steps. The distribution of the other time steps are quite similar in appearance and can be found in the Appendix as Figures A.1, A.2, and A.3 respectively.
For the fast solving stations A, B, and D, the run times of the instances are all very similar in that the middle box is barely visible. Station C is again outstanding by having a rather large spread of run times only matched by the biggest and most complex station G. However, in general the difference between the median and maximum run time are lower than an order of magnitude for all of the stations. That means that for our instance set even the extreme cases are still in reach and have a somewhat similar run time to the rest of the instances, Figure 6: Run times of instances with 12 time steps, sorted by station. One dot represents one instance, however dots may overlap when too many have similar run time. In addition we draw a typical whiskerless box plot. Therefore the lines represent the 25th percentile, the median and the 75th percentile.
making Algorithm 1 well suited for strict time limit situations, which we would face in a production environment.
As the last run time related graphic, Figure 7 shows the average portion of run time spent in the single subroutines of Algorithm 1 for 12 time steps as well as for 96 time steps. It states that the transient smoothing dominates the run time already for 12 time steps. With an increasing number of time steps, the influence even increases, while the improvement heuristics impact is decreasing. This can also be verified through the 24 and 48 time step instances displayed in Figure A.4 in the Appendix. The observed shares fit to the structure of the single subroutines from Section 4. While the initial solve as well as the smoothing have to solve more instances of the challenging model variants P s respectively P f with each additional time step, the improvement heuristic iterates over pairs of subsequent time steps with different operation modes. Since the differently sized time horizon partitions all cover the same 12 hours, the number of total operation mode changes in the result of the initial solve is very unlikely to increase much with increasing time granularity. This is also due to the initial solve algorithm itself, which tries to use the same operation mode as long as possible before changing to a different one.
Apart from the run time we are also interested in the quality of the solution of Algorithm 1. Therefore, we solved the complete model P directly on the smaller instances A to E to obtain a valid lower bound for them. Note that P itself is already only a relaxation of the problem solved by Algorithm 1, since the transition time restrictions for operation modes mentioned in Section 2.7 are not included in this model. The obtained lower bound can be used to give an upper bound on the relative difference of the solution found by Algorithm 1 to the optimal solution. We call this difference gap and define it as obj(solution)−lowerBound obj (solution) . Since the lower bound cannot be negative, the gap is always smaller than or equal to 100%.
We solved model P on stations A to E for 12 time steps using the same com-  putational setup as described above with a time limit of 10 hours. Furthermore, we gave the solution obtained from Algorithm 1 as a starting solution. The solver finished all 159 instances of the stations A and B within the time limit. For the stations C, D and E we hit the time limit for 47, 8 and 18 instances respectively, resulting in potentially sub-optimal lower bounds in these cases. For the stations F and G more than half of the instances did not finish in time, which is why we did not include these in our evaluation. The results can be found in Figure 8, where we plotted the gap for all 5 network stations on a logarithmic scale and then plotted the same data for stations C and E on a linear scale to better depict their distributions. Note that the instances which have a potentially suboptimal lower bound are marked with triangles instead of dots. As one can observe, we have good results for all of the network stations with more than half of the instances for each station having a solution with at most 10% difference to the lower bound. For the best three stations A, B, and D we can even state that at least 75% the of values have less than 1% difference to the optimal solution. The picture is more diverse for the other two stations, in which the 75% percentile is between 20% and 30% gap. However, for those stations we have the highest value of instances for which P did not solve the problem in time. Therefore, there is the potential for further improvement of the gap by increasing the lower bound here. For all stations C, D and E with instances having potentially suboptimal lower bounds, the corresponding instances are among those having high gap values. The picture is extreme for Station C, where all but 3 instances above the 75% percentile had suboptimal lower bounds, making it very likely that one can further decrease the gap value by increasing the run time limit of P directly.
Altogether, the results show that Algorithm 1 is able to find good solutions in a short time and is therefore able to solve even large instances of the presented gas transportation problem, which would be far out of reach when trying to solve the original MIP model formulation P directly. Figure 8: Gap between the solution of Algorithm 1 and a lower bound obtained from solving P directly. We considered the stations A to E and solved them using 12 time steps. One dot represents one instance, however dots may overlap when too many have similar run time. The instances with potentially suboptimal lower bounds due to hitting the time limit are marked by triangles instead of dots. In addition we draw a typical whiskerless box plot. Therefore the lines represent the 25th percentile, the median and the 75th percentile. The left graphic uses logarithmic scale for the gap and the right one displays the same data for stations C and E on linear scale. For all instances in which the objective function value and the lower bound are smaller than 0.1 we defined the gap to be zero. Furthermore we plotted all values below 10 −3 as 10 −3 for the logarithmic scale.

Conclusion
We presented the transient gas network transportation problem on so-called network stations, which represent the intersection points of major transportation pipelines and contain the majority of active elements to control the network. For this problem we introduced a mixed-integer programming model including a complex model for compressor stations as well as additional variables and constraints for the network station itself. For the pipes, we found that due to their shortness they have less overall impact in network stations, which enabled us to use a linear description for them. Using state-of-the art solvers the MIP Model is not tractable for large network stations. Therefore we developed a specialized algorithm to solve the problem. Here we again use the fact that network stations contain only short pipes and therefore negligible linepack, causing the decision making to mostly depend on the current demand situation at the boundary of the station. Therefore, we determined the operation modes and gas flow directions of the network station based on multiple solves of a stationary version of the MIP. By using this approach we are also able to satisfy transition time constraints for the operation modes, which have been excluded from the original MIP formulation. In order to obtain a feasible solution for the problem, we finally solved another variant of the MIP with fixed network station operation modes and flow directions in a rolling horizon fashion.
To verify the competitiveness of our algorithm regarding both run time and solution quality, we did tests on 159 different scenarios based on past flow situations in 7 real world network stations provided by our project partner. Our algorithm is able to compute feasible solutions for all of the presented instances very fast. Even for the biggest of the presented network stations consisting of more than 100 nodes and arcs as well as over 1000 different network station operation modes, our algorithm terminates on average in under 20 minutes for each of the stations. By running experiments using 4 different types of granularity, we found that the run time increases proportional to the number of time steps used, indicating that our algorithm scales very well. In terms of quality of the solution, we tested the results against a lower bound obtained from solving the original MIP formulation for 12 time steps on all except the biggest 2 network stations. For the worst of the 5 remaining stations, more than half of the instances have a solution with at most 10% difference to the lower bound and for the best 3 stations, we find near optimal solution with less than 1% difference for more than 75% of the instances. Altogether we were able to show, that our algorithm can reliably find good solutions to the problem in a short amount of time.
There are a lot of different possibilities to continue this research. To increase the model accuracy, the approximative linearization of the friction in pipes and resistors as well as the maximum power bound of compressor units could be replaced by their original non-linear versions. This would turn the model into a MINLP and thereby increase its complexity. From a theoretical point of view, extending Algorithm 2 such that it has the guarantee to always find existing feasible solutions would greatly improve the overall robustness. Finally, realworld gas network operation is a complicated business with a never-ending list of special elements and extra constraints, which can still be added to our model. As examples we name ramp-up and cool down times for compressor units as well as target value based control of regulators and compressor stations.