1 Introduction

As one of the biggest railway freight operators in Europe, Rail Cargo Austria (RCA) controls 653 locomotives and around 24,000 freight cars, transporting approximately 68.5 million tons of cargo per year through the Austrian railway network (OEBB 2020). Transporting cargo through a railway network requires many resources and causes high expenses. Thus, railway companies need to apply strategies that enable cost-efficient and smooth logistical processes. In the following we present two approved strategies with different characteristics.

The first strategy couples freight cars to so-called block trains. A block train or direct train is defined as a train that travels from its initial station to its destination without stopping at intermediate stations, and thus, no further couplings are done. While block trains are the fastest way of railway freight transportation, they cause significant expenses as their operation requires many resources such as locomotives, locomotive drivers, freight cars, and rail line capacities. As a simple rule of thumb, to stay economically sustainable, at RCA a block train should consist of at least 15 freight cars. This strategy can be applied for large customer orders as well as for shuttles for intermodal container traffic operating between two terminals.

The second strategy is called single freight car transportation and is tailored for customers with smaller transportation volumes, requiring at least one freight car. In comparison to block trains, the process of single freight car transportation is more complex. This strategy involves the concept of a so-called routing matrix, i.e., freight cars have to follow a predefined path based on their origin and destination. Moreover, if any two freight cars with the same destination meet in the network, they have to use the same path to their destination. Applying a routing matrix to a freight railway system does bring advantages as well as disadvantages with it. For more details about the network topology of RCA, as well as the routing matrix with all its benefits and downsides we hereby refer to Sect. 1.1.

The problem of obtaining a schedule for single freight cars is typically solved in two steps. First, the routes for freight cars in the cargo network are obtained by solving the Single-Car Routing Problem (SCRP). The solution of an SCRP contains only the routes for freight cars, without any train departure and arrival times. A freight train schedule that incorporates train frequencies and departure times is determined in the second step by solving the Freight Train Scheduling Problem (FTSP). Both problems aim to minimize operational cost while satisfying several constraints, e.g., train capacities, shunting yard capacities, and maximal allowed travel times of freight cars.

In this work, we develop an integrated mathematical model that solves both, the SCRP and the FTSP in RCA ’s railway network in one step, while simultaneously constructing a routing matrix. Based on a discretized time-space network, we minimize the total operational cost. While neither solving the SCRP, solving the FTSP, nor constructing a routing matrix is a novel idea, to the best of our knowledge, combining all three of the latter aspects in an integrated approach has not been considered before. Due to the arising computational difficulty when combining complex optimization problems, performance issues might have to be solved such that the approach can be applied for RCA in real life. The paper is organized as follows. In Sect. 1.1, we examine the structure of the single freight car routing network in Austria and give an example that illustrates the impact of a routing matrix. The main cost components linked to RCA ’s network are stated in Sect. 1.2. In Sect. 1.3, we introduce the Single-Car Routing Matrix and Train Scheduling Problem (SCRMTSP). Related research is presented in Sect. 2. In Sect. 3, we formally define a mathematical model for solving the SCRMTSP and propose an Integer Linear Program (ILP) based on a time-space network. Computational experiments are discussed in Sect. 4. Finally, we conclude and give an outlook for future work in Sect. 5.

1.1 Single freight car network in Austria

Rail Cargo Austria’s single freight car network follows a hub and spoke design and is divided into a primary and a secondary network. The primary network consists of 14 shunting yards and 67 out of \(\left( {\begin{array}{c}14\\ 2\end{array}}\right) =91\) potential connections between them. The secondary network consists of loading stations, as well as operational nodes. Customers deposit their shipments, i.e., freight cars, at loading stations. The freight cars are then collected by RCA, and gathered at operational nodes for further distribution to the primary network. Thus, the secondary network can be seen as upstream layer for the primary network.

Fig. 1
figure 1

Hub and Spoke Design Network RCA: Primary network connecting shunting yards (light gray nodes); Secondary network connecting the primary network with operational nodes (mid gray nodes) and operational nodes with smaller loading stations (black nodes)

Each loading station is assigned to exactly one operational node, and each operational node is assigned to exactly one shunting yard. As a result, there exists a unique path for freight cars from their loading station in the secondary network to the nearest shunting yard in the primary network. Therefore, we only focus on routing in the primary network.

Rail Cargo Austria’s railway network has been operated based on the concept of routing matrices for decades. The idea is that the path for a particular freight car depends only on its origin and destination. Moreover, if any two freight cars with the same final destination (but potentially different origins) meet at some shunting yard in the primary network, they have to follow the same remaining path. These paths are defined in a routing matrix that is generated based on company-internal rules and on customer demands. Using such a routing matrix has the big advantage that human errors, caused by manual handling of freight cars in shunting yards, can be significantly reduced. We refer to Fügenschuh et al. (2013) for a closer look on different routing strategies in rail freight transportation. Applying the concept of a routing matrix does not only result in benefits though. First of all, routing matrices are generated based on a certain demand scenario. In many cases this is done by using a forecast of the future customer demands. However, in cases of a low accuracy of the forecast, a wrong routing matrix would be used for the actual occurring demand. In turn, this would result in inefficiencies when operating the network. Secondly, it limits the flexibility when taking decision to a certain extend. Using a flexible routing strategy, meaning that orders can be transported through the network without consideration of rules as defined by the routing matrix, can lead to significant monetary savings. This can be reasoned by the fact that the number of train kilometers, number of shunting processes, as well as overall waiting times of freight cars in shunting yards usually decrease. We illustrate the impact on the overall number of driven train kilometers of both routing strategies in the following simplified Example  1.

Example 1

(Routing Matrix) We are given a network with train stations and corresponding connections including travel distances in Table 1(a). Moreover, we are given a set of orders with origin, destination and the total length of the related freight cars in Table 1(b). Assuming that a train is restricted to a total length of 400 meters and that the number of train kilometers has to be minimized, we would obtain the results as depicted in Fig. 2 for two different routing strategies.

On the left hand side we show the result obtained by applying the concept of a routing matrix. The corresponding matrix for this example is depicted in Fig. 3. If a tour between two stations exists, the corresponding matrix entry contains either a D for a direct connection, or all stations that must be visited consecutively. For order A, we obtain that there is a connection from Vienna to Linz and thus, the corresponding entry in the routing matrix is set to D. For order C, we obtain the connection from Vienna to Graz and from Graz to Salzburg. Thus, a train from Vienna to Salzburg must travel via Graz which is depicted by G in the routing matrix. When requiring a routing matrix, orders A and B are not allowed to travel via different routes from Vienna to Linz because they meet in the same shunting yard and have the same destination. As the capacity of Train 1 is already fully exhausted with order A, an additional train has to be used for transportation of order B. Summed up over the four trains, this results in a total of 808 driven kilometers.

Table 1 Input for Example 1
Fig. 2
figure 2

Results for Example 1 with application of a routing matrix concept on the left hand side, and the application of an individual freight car routing strategy on the right hand side

Fig. 3
figure 3

Routing matrix based on input of Table 1. The rows and columns correspond to the train stations Vienna (V), Linz (L), Graz (G) and Salzburg (S)

In comparison, the result for applying a flexible routing strategy is shown on the right hand side. If we do not require the application of a routing matrix, the freight cars can be routed individually. Order B is transported by Train 1 directly to its destination. Orders A and C can be transported jointly by using Train 2 and orders A and D can be transported jointly by using Train 3. All train length capacity limitations are satisfied, resulting in a total of 619 driven kilometers. Comparing the numbers of both routing strategy shows, that 30% more train kilometers are needed when applying a routing matrix constraint to the network. \(\lozenge \)

Considering the potential cut in operational cost, there would be good reason for applying a flexible routing strategy at RCA. Furthermore, facing changing demand scenarios would not depict a huge problem. However, RCA still prefers to stay with its current routing strategy, i.e., the application of a routing matrix. This can be justified by two reasons. First, as already mentioned, RCA is eager to reduce any operational disruptions caused by human errors. Furthermore, all decision support systems currently used for operational processes are tailored to the concept of a routing matrix. Adapting to a more flexible routing concept would thus not only mean a total re-design of the previous way of work, but also a new development of many IT-Systems. Since it is decided to stay with the concept of a routing matrix, the best solution would be to change the matrix on a quite regular basis, in order to adapt to changing demand scenarios. As generating routing matrices manually depicts a challenging, as well as time consuming task, a mathematical model can help to not only speed up, but also optimize this process. Therefore, routing matrices can be generated on short notice, making it possible to adapt to changing demand scenarios in a timely manner.

1.2 Cost factors

The main costs are caused by the resources required for infrastructure, locomotives, and freight cars. In this work, we incorporate these costs by considering three different factors: shunting cost, routing cost, and holding cost for freight cars.

Shunting cost per freight car correspond to the costs of handling the freight cars at shunting yards. These costs can differ strongly depending on the size of the respective shunting yard since shunting processes in smaller yards are typically more expensive than shunting in big yards. This can be mainly explained by the more efficient use of the deployed resources, when moving a higher number of freight cars at once, as it can be done in larger yards.

Routing cost of trains account for all costs that arise when a train is operated. Besides the energy consumption and the weight of the train, this cost factor also incorporates the expenses required for locomotives and train drivers. In this work, we assume an average cost rate per kilometer for weight based track usage, energy consumption, as well as locomotive and driver usage. We refer to this cost as costs for one train kilometer. From the three considered cost categories, cost for routing typically account for the biggest share.

Finally, holding cost for freight cars account for the total time the freight cars spent in the railway network, which consists of the travel time on rails and the time spent in shunting yards. The time spent in shunting yards can be divided into the actual time needed for the shunting process plus the time a freight car has to wait for its train to depart.

1.3 Problem statement

The process of routing single freight cars in a railway network is commonly known as SCRP. The aim is to ship all freight cars from their origins to their destinations, while simultaneously minimizing cost or maximizing profits. The SCRP helps to find optimal routes for the freight cars, plan shunting activities, and decide on the number of trains needed for the connections in the network. Usually, the SCRP allows an individual routing of freight cars, i.e., each freight car can take an individual route from its origin to its destination. However, in this work RCA ’s rule to apply a routing matrix is considered, and thus, we deal with an adapted version of the SCRP, which we call Single-Car Routing Matrix Problem (SCRMP).

Following RCA ’s network topology in Fig. 1, a train only travels on direct connections between two shunting yards without intermediate stops. The number of trains needed for connecting the shunting yards is determined by solving the SCRMP, usually under consideration of weight and capacity restrictions of the same. Finally, it remains to determine train departure and arrival times by solving the FTSP.

The SCRP and FTSP are often solved separately to decrease the difficulty of the problem. This is especially necessary when the plans are constructed by human labor. However, train schedules depend on the actual freight car routes and define the waiting times of freight cars in shunting yards and thus, have a significant impact on the freight car holding cost. Therefore, we aim to solve the SCRMP and FTSP in a combined way to obtain a global optimal solution. Hence, with Definition 1, we introduce the SCRMTSP.

Definition 1

(Single-Car Routing Matrix and Train Scheduling Problem) Let \(N:=(S,T)\) be a network with shunting yards S and directed connections T (tracks). Each shunting yard has a capacity of freight cars that can be processed at the same time. Each track has a weight and length capacity regarding the train that uses the corresponding connection. Let C be the set of orders with information about its origin \(o(c)\in S\), its destination \(d(c)\in S\), the number of needed freight cars, its total length and its total weight. Moreover, every \(c\in C\) has a ready time and a maximal allowed distribution time. The SCRMTSP aims to route each order \(c\in C\) from o(c) to d(c), by composing the corresponding freight cars to trains with specific departure and arrival times, while constructing a routing matrix with pairwise unique paths between all \(s_1,s_2 \in S\). The objective is to minimize the overall cost, while simultaneously ensuring that all capacity and routing constraints are satisfied and every order is delivered on time.

1.4 Contribution

  • We introduce the SCRMTSP and an associated exact solution approach. While respecting all constraints and cost factors as stated by RCA, our ILP formulation is able to generate a routing matrix and a train schedule simultaneously based on a given time discretization.

  • We perform an extensive computational study based on real-world instances derived from RCA. We consider the computational performance of the ILP and analyze several key performance indicators (KPIs) such as utilization of trains, waiting times and the number of shunting processes.

2 Literature review

Concerning different routing strategies in railway freight transportation, a differentiation between European and North American approaches can be made, based on the existing literature. Whereas the already mentioned routing of single freight cars is commonly used in Europe, the concept of railway blocking is applied especially in North America. A comparison between North American and European railway systems can be found in Clausen and Voll (2013). For all modes, single freight car routing (with and without routing matrix) as well as railway blocking approaches, research has been done over the past years. In some studies a certain focus is put on obtaining routing and train scheduling decisions simultaneously. Relating to these joint problems, Assad (1980) was one of the first researchers considering both aspects at once in his work.

Considering the mentioned differences, this literature review is structured as follows. The first part takes a closer look on railroad blocking problems. The second part reviews relevant literature referring to single freight car routing. Finally, the last part considers studies dealing with the joint freight car routing and train scheduling problem. Both routing strategies, railway blocking as well as single freight car routing in combination with train scheduling are summarized. For a general survey on optimization models for train routing and scheduling we refer to Cordeau et al. (1998).

The railway blocking problem (RBP) deals with the assignment of freight cars of individual shipments to blocks of freight cars. An order comprises of a certain amount of freight cars and may pass through several shunting yards on its route from origin to destination (O–D). Since classifications of cars in shunting yards can cause major delays and have a significant impact on costs, railway companies aim to reduce their overall shunting processes. This is done by forming blocks with unique O–D pairs, consisting of groups of individual shipments. The O–D pair of a consignment does not have to match the O–D pair of the block it is traveling with, but it can. Once an individual shipment is placed in a block, it will not be re-classified until it reaches the destination of the block. By solving the RBP, an optimal mix of shipments traveling via direct blocks and shipments traveling via several blocks on the way to their destination is found, such that the overall operational cost is minimized. Restrictions are usually applied on the capacity of the blocks, the maximal number of blocks to be built in shunting yards and the maximal number of freight cars to be processed in a shunting yard per time period. The number of trains needed for transporting the blocks through the network is usually not of consideration though. For previous work done in the field of RBP we refer to Barnhart et al. (2000), Yaghini et al. (2012), Ahuja et al. (2007), Bodin et al. (1980) and Newton et al. (1998). Solutions for a combination of RBP and train routing decisions are proposed by Keaton (1989) and Crainic et al. (1984). Their models help to determine an optimal blocking plan for freight cars by simultaneously deciding on possible train connections between nodes in the network, as well as the frequencies of train connections. However, the scheduling of trains is not part of their studies.

For approaches considering single-freight car routing, we mainly refer to the model proposed by Fügenschuh et al. (2013) for a German railway provider, as it is the only previous work using the concept of a routing matrix. They model the problem as multi-commodity flow problem, deciding on routes of the freight cars and the number of direct trains traveling between shunting yards, needed for transportation. The waiting times of freight cars at shunting yards are firstly modeled as constant parameter. Later, the model is refined by including waiting times as a function of maximal possible waiting times for cars in shunting yards divided by the number of trains leaving from the yard. Besides the restrictions on length and weight capacities of trains, shunting yard capacities and maximally allowed travel times of freight cars, they introduce so-called unique successor constraints. By the help of these constraints they force orders with the same destination meeting somewhere in the network, to travel together via the same path on their remaining routes. As a result, the optimized routing matrix for their network is obtained. However, in contrast to their work, we additionally consider the scheduling of trains in this work. While the difficulty of the problem increases significantly due to the time component in the network, considering the SCRP and the FTSP in a jointly manner allows us to model the problem in a more realistic way. As mentioned in Sect. 1.2, waiting times of freight cars do not only depend on the duration of shunting processes, but are also highly influenced by train departure times. Thus, in order to take this fact into account, train schedules should be of consideration when obtaining routing plans for freight cars. Since all of the necessary constraints, needed for solving the specific SCRP under the concept of a routing matrix, are already considered in the approach as presented in Fügenschuh et al. (2013), the mathematical model is reformulated and extended in this work. By doing so, it is possible to apply the formulation to our time-space network presented in Sect. 3 and thus, solve the SCRMTSP.

Zhu et al. (2009), Gorman (1998), Huntley et al. (1995) and Ceselli et al. (2008) propose models for solving the Single-Freight Car Routing (SCRP) and Train Scheduling Problem (FTSP) jointly. Whereas in Zhu et al. (2009) the joint problem is solved as RBP and FTSP, in Huntley et al. (1995) the authors aggregate the demand with the same origin and destination into batches of freight cars with defined ready times as a routing strategy. By the help of the ready times they are able to route the batches over timed routes, describing the sequences of trains used. For each train the model decides on its departure and arrival time. In Gorman (1998) and Ceselli et al. (2008) the single-freight car routing strategy is applied. Gorman (1998) decomposes the problem into an individual SCRP and FTSP and solves them by the help of a genetic algorithm and a tabu search heuristic. Ceselli et al. (2008) present different models to solve the SCRP and FTSP simultaneously for the Swiss Federal Railways. Due to their very specific network topology, freight cars are collected from different stations and either sent via direct trains to their destination, or to one of the two hubs in the network. In the hubs, the freight cars are bundled to new outbound trains and either sent to their destinations or to the other hub in the network. In all of the mentioned studies where the single-freight car routing strategy is applied, freight cars can take individual routes through the network. Therefore, none of the existing approaches is suitable for our specific problem statement since we want to route our freight cars by the help of a routing matrix. Extending these existing approaches by introducing routing matrix constraints might not be possible without changing their entire mathematical models and thus, we decided to reformulate and extend the model for the SCRP from Fügenschuh et al. (2013) where these constraints are already included.

3 Mathematical model

Time-space network We model the primary network of RCA on a time-space network. In the following we describe the construction of a directed graph and refer to Fig. 4 for an illustrative example.

Fig. 4
figure 4

Time-Space Network \(\lambda = 1\): The nodes display three different shunting yards A, B and C. The small nodes illustrate corresponding shunting yard nodes for each time period, the big nodes illustrate sink nodes. Shunting Yard A is connected with shunting yard B, drive time = 2 time periods. Shunting Yard C is connected with shunting yard B, drive time = 1 time period. Illustration of arc types: \(A_1\) = dashed arrows, \(A_2\) = lined arrows, \(A_3\) = dotted arrows

For determining departure and arrival times of trains, we discretize the planning horizon, which leads to a certain number of different points in time that can be used for departure and arrival in every station. The discretization is done by a discretization parameter \(\lambda \in {\mathbb {N}}\), which displays the constant interval size in hours between the points in time. From a practical point of view, freight trains do not have to be planned exact to the minute as it is done for passenger trains, for example. An hourly planning basis is fully sufficient. However, everything longer than 2 h might shrink the solution space too much and thus makes the planning too inaccurate. We define \(\Lambda :=\{0,\lambda ,2\lambda ,\ldots ,k\lambda \}\), where \(k \in {\mathbb {N}}\) and \(k\lambda \) is the end of the considered planning horizon. Note that the larger \(\lambda \), the smaller the network. For illustration, let us consider a planning horizon of 24 h. If we choose \(\lambda =1\), we obtain the discretized time horizon \(\Lambda =\{0,1,\ldots ,23,24\}\). If we choose \(\lambda =2\), it results in \(\Lambda = \{0,2,\ldots ,22,24\}\). Looking at the actual travel times between two shunting yards, this means that each travel duration is rounded up to the next full hour occurring in \(\Lambda \). Thus, each travel duration in the resulting network is a multiple of \(\lambda \). Considering the shunting duration, each shunting yard \(s \in S\) has a predefined duration \(d_s \in \Lambda \) that is needed for shunting a freight car.

There are two types of nodes in the network: \(V:=V^{t}\cup V^{s}\). For each shunting yard, we introduce a node for each point in time in \(\Lambda \), and collect them to the set of shunting nodes \(V^t\) . Thus, each \(i \in V^t\) corresponds to a shunting yard \(s \in S\) and a point in time \(t_i \in \Lambda \). The cost for shunting a single freight car in a node \(i \in V^t\), are denoted by \(\sigma _i \in {\mathbb {R}}^+\). Shunting yards have a limited capacity. We denote the maximal allowed number of freight cars that can simultaneously be located in \(i\in V^t\) by \({\mathcal {C}}_{i} \in {\mathbb {N}}\). Further, for each shunting yard \(s \in S\), we introduce a sink node and collect them to the set of sink nodes \(V^s\). The shunting yard associated to a node is determined by the function \(\overline{s} : V \rightarrow S\).

There are three types of arcs in the network: \(A:=A_1\cup A_2 \cup A_3\). Arcs in \(A_1 \in V^t \times V^t\) connect nodes in \(V^t\) with each other and are called driving arcs. We introduce a driving arc from node \(i\in V^t\) to a node \(j\in V^t\), associated to shunting yards \(\overline{s}(i) = s \in S\) and \(\overline{s}(j) = s^\prime \in S\), if and only if \(s \not = s^\prime \) and the time of \(i\in V^t\) plus the travel time between s and \(s^\prime \) equals the time of \(j\in V^t\). Arcs in \(A_1\) are weighted by \(\ell _{ij} \in \mathbb {R^+}\), the distance between the respective shunting yards in the network. Arcs in \(A_2 \in V^t \times V^t\) act as waiting tracks in shunting yards and are called waiting arcs. We introduce an arc (ij), \(i\in V^t\), \(j\in V^t\), if and only if \(\overline{s}(i) = \overline{s}(j)\) and \(t_i+\lambda =t_j\), i.e., waiting arcs connect timely consecutive nodes within a shunting yard. Finally, arcs in \(A_3 \in V^t \times V^s\) connect all nodes from \(V^t\) to their associated sink node in \(V^s\) and are called final arcs.

Orders The set of customer orders is defined by \(C :=\{1,2, \ldots , c, \ldots , p\}\). For an order \(c \in C\), the related number of freight cars is denoted by \(n_{c} \in {\mathbb {N}}\), the total weight by \(W_{c} \in {\mathbb {R}}^+\), and its total length by \(L_{c} \in {\mathbb {R}}^+\). Note that the origin and destination, associated to an order \(c\in C\), are unique. They are determined by functions \(o : C \rightarrow V^t\) for the origin and \(d : C \rightarrow V^s\) for the destination. The ready time of an order c is denoted as \(r_c \in \Lambda \). It defines the time at which order c is ready for distribution in its origin o(c). The maximally allowed distribution time of an order after its ready time is denoted by \({\mathcal {T}}_c\in \Lambda \).

Trains and freight cars For composing freight cars to trains, we consider restrictions to the maximal length and the maximal weight for a train driving along an arc. For an arc \((i,j) \in A_1\), the maximal length for a train is denoted by \({\mathcal {L}}_{ij} \in {\mathbb {R}}^+\) and the maximal weight of a train is denoted by \({\mathcal {W}}_{ij} \in {\mathbb {R}}^+\). Further, parameter \(\kappa \) denotes the cost for a train traveling one kilometer. Finally, parameter \(\eta \) gives the holding cost for one freight car per time unit.

3.1 Integer linear programming formulation

Based on the time-space network as defined above, we propose the following ILP formulation for solving the SCRMTSP. All sets and parameters needed for the formulation are summarized in Table 2.

Table 2 Sets, functions and parameters

In the following, we introduce four types of binary decision variables:

$$\begin{aligned}&x_{ijc} \in \{0,1\},\ c\in C,\ (i,j)\in A,&x_{ijc}= {\left\{ \begin{array}{ll} 1,\text { if order}\, c \,\text {travels on arc}\, (i,j) \\ 0, \text { else.} \end{array}\right. } \\&y_{ij} \in \{0,1\},\ (i,j)\in A_1,&y_{ij}= {\left\{ \begin{array}{ll} 1,\text { if arc}\, (i,j)\, \text {is used by any train} \\ 0, \text { else.} \end{array}\right. } \\&z_{sc} \in \{0,1\},\ c\in C,\ s \in S,&z_{sc}= {\left\{ \begin{array}{ll} 1,\text { if order}\, c\, \text {visits shunting yard}\, s \\ 0, \text { else.} \end{array}\right. } \\&q_{stc} \in \{0,1\},\ c\in C,\ s, t \in S, s \not = t,&q_{stc}= {\left\{ \begin{array}{ll} 1,\text { if order}\, c\, \text {travels from} s \in S \,\text {to}\, t \in S \\ 0, \text { else.} \end{array}\right. } \end{aligned}$$

Please note that we do not generate variables, which obviously violate any time or location restrictions. For example, we would not set up \(x_{ijc}\) if \(r_c \ge t_i\). Moreover, we would not set up this variable if \((i,j)\in A_3\) and \(j \ne d(c)\), or entering j at \(t_j\) would exceed the maximal distribution time \({\mathcal {T}}_c\). The ILP for solving the SCRMTSP reads now as follows:

figure a
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
$$\begin{aligned}&x_{ijc}&\in \{0,1\},&\forall \, (i,j)\in A,\,\forall \, c \in C \\&y_{ij}&\in \{0,1\},&\forall \,(i,j)\in A_1 \\&z_{sc}&\in \{0,1\},&\forall \,s\in S,\, \forall \,c \in C \\&q_{stc}&\in \{0,1\},&\forall \,s,t \in S: s \not = t, \forall \,c \in C \end{aligned}$$

Objective The objective consists of three cost factors. The first term accounts for the total cost for train kilometers in the network, which is calculated by the total length of the tracks, weighted by the respective cost per train kilometer. The two remaining terms are linked to freight car expenses. The second term corresponds to the total holding cost for the travel duration for shipping all freight cars of all orders from their origins to their destinations. Finally, the third term accounts for the shunting cost that occur for handling freight cars at shunting yards.

Constraints Constraints (1) to (3) state the flow conserving properties. By (1) and (2), it is ensured that each order leaves the origin and reaches its destination exactly once. Further, (3) guarantees that an order reaches and leaves a node, different from its origin and destination, at most once. We add Constraints (4) for strengthening the model, i.e. if an order uses a waiting arc it enters the node j via a driving arc at most once. Constraints (5) and (6) ensure unique paths between two shunting yards, i.e., define the resulting routing matrix. Conditions (5) state that if order c travels from shunting yard s to shunting yard t, the order must travel between two nodes i and j with corresponding shunting yards s and t. If two different orders c and e with the same final destination d(c) = d(e) meet at a certain shunting yard s, both orders have to follow the same path as defined by Conditions (6) . Constraints (7) are required to keep track if order c visits a shunting yard s on its route. Shunting times are considered by Constraints (8). It is ensured that order c spends at least the time required to perform the shunting process in shunting yard s. We add Constraints (9) to get better bounds for the model, any \(x_{ijc}\) of our set of driving arcs \(A_1\) can be at most \(y_{ij}\). The capacity limitations for arcs and nodes are defined by Constraints (10), (11), and (12). Equation (10) states that the total length of all orders sent on a driving arc \((i,j) \in A_1\) must not exceed the maximal allowed length. The same is achieved with Equation (11) in terms of the maximal allowed weight. Finally, Equation (12) ensures that the sum of all freight cars arriving in node \(j \in V^t\), plus the ones already waiting there, does not exceed the capacity of the node.

The solution of the ILP can be interpreted as follows. Each variable \(y_{ij}\) with solution value 1 corresponds to a route section used by a train. The corresponding times in nodes i and j form the optimal departure and arrival times of the train. To get all orders (freight cars) that travel on arc (ij), all related variables \(x_{ijc}, \, \forall c \in C\), with solution value 1 have to be collected. The routing matrix can be read from the \(q_{stc}\) variables, as they describe directed arcs used by an order c. For illustration consider Example 2

Note that not every instance creates a routing matrix for the whole network, as only shunting yards corresponding to involved orders are used. The more orders with different shunting yards are included in an instance, the more entries the routing matrix contains. In practice, the routing matrix should be recreated several times a year. Generally speaking, every time a significant change in the transportation demand’s structure occurs. However, a more frequently re-planning of the routing matrix guarantees more flexibility. Therefore, e.g. a weekly update of the routing matrix will lead to a reduction of overall cost, since the network structure is always coordinated to the actual demand structure.

Example 2

(Routing Matrix, follow up) The routing matrix can be read from the \(q_{stc}\) variables, as they describe directed arcs used by an order c. For illustration consider Example 1 and the corresponding routing matrix in Fig. 3. For order A, we obtain \(q_{ViennaLinzA}=1\), thus, the corresponding entry in the routing matrix is set to D. For order C, we obtain \(q_{ViennaGrazC}=1\) and \(q_{GrazSalzburgC}=1\). Thus, a train from Vienna to Salzburg must travel via Graz. \(\lozenge \)

4 Experimental evaluation

In this section, we present results obtained through an extensive computational evaluation on the proposed ILP as stated in Sect. 3.1. The approach was implemented in ASP.NET Core (C#) and Gurobi 9.1.2 solver was used for solving the ILP. All experiments were executed on a Windows 64-bit machine, equipped with Intel(R) Core(TM) i5-8365U CPU @1.60GHz, 4 cores and 16 GB RAM. The time limit for solving the ILP was set to 2 h.

Problem instances All benchmark instancesFootnote 1 are generated based on real-world data provided by the RCA. For discretization of the driving arcs in \(A_1\), the rounded up average driving times between two nodes were used. Thus, if the average driving time between two certain nodes takes 40 minutes in reality, for \(\lambda = 1\) an arc with driving time of one hour and for \(\lambda = 2\) an arc with driving time two hours is constructed. The same logic was applied for the shunting times. For instances where the same discretization parameter, as well as the same length regarding the time horizon is considered, the size of the problem graph remains the same. In such case, the size of the problem instance can be seen in relation to the total number of orders to be routed through the network. For all of the orders, real data regarding the number of freight cars, the length and weight, as well as the origin and destination shunting yard was used. It has to be mentioned that the actual ready times were adapted in such way that a feasible solution could be obtained for all orders. The maximal travel time parameter \({\mathcal {T}}_c\) was set to 24 hours for all orders. Thus, \({\mathcal {T}}_c\) only affects routing decisions for instances with more than 24 hours time horizon. Real cost factors were used as weights for the target function. Due to reasons of concealment, the cost factors can not be stated in the provided instance data1.

Variables In Table 4 we give an overview of the proportions of the various sets of variables. We state the network, the number of orders, the number of constraints and the total number of variables. Besides that, we divide the variables according to Sect. 3.1 and state the numbers of the respective sets, as shown again in Table 3 for reasons of convenience.

Table 3 Abbreviations for Table 4
Table 4 Details on variables: time horizon \(\in \{24\,\text {h}, 48\,\text {h}, 72\,\text {h}\}\), \(\lambda \in \{1,2\}\)

The first observation concerns the impact of the chosen discretization on the number of \(x_{ijc}\) and \(y_{ij}\) variables. Comparing \(\lambda =1\) versus \(\lambda =2\), the numbers almost halve. Similar to the number of variables, also the number of constraints is related to the degree of the discretization parameter. Thus, a decrease of \(32\%\) for the smallest instance and \(14\%\) for the largest instance can be observed when comparing the number of constraints for \(\lambda = 1\) and \(\lambda = 2\). Increasing the time horizon leads to a strong increase of the number of nodes and arcs of the corresponding discretized time-space network. Thus, also a strong rise in the number of variables as well as in the number of constraints can be observed. The \(x_{ijc}\) variables clearly dominate in all cases as they cover all of the three different sets of arcs in the network. The share of \(x_{ijc}\) variables compared to the overall number of variables amounts to \(88\%\) on average over all instances1.

As \(y_{ij}\) variables are only generated for the set of driving arcs \(A_1\), their share in regards to the overall sum of variables only amounts to \(1.2\%\) on average. In contrast to \(x_{ijc}\) and \(y_{ij}\), \(q_{stc}\) and \(z_{sc}\) only depend on the number of shunting yards in the network, as well as on the number of orders to be transported. Thus, they are not directly affected by the chosen discretization parameter. Whereas the share of \(z_{sc}\) variables compared to the number of overall variables only accounts for \(0.8\%\), the number of \(q_{stc}\) variables accounts for almost \(10\%\) on average.

4.1 Performance analysis

In Table 5, we state abbreviations used in Tables 6 and 7 in the current section. Table 6 shows details about the performance when applying the proposed ILP on instances comprising of at least 20 and up to 300 orders.

Table 5 Abbreviations for Tables 6 and 7
Table 6 ILP Performance Details: time horizon 24, 48 and 72 h, \(\lambda \in \{1,2\}\)

The setting of the discretization parameter has a big impact on the model size and thus, on the performance of the ILP. On Network H24\(\lambda \)1 it was possible to solve instances with up to 120 orders to optimality, whereas instances up to 200 orders could be solved to optimality on Network H24\(\lambda \)2. Considering the solving times of the ILP on the networks with a time horizon of 24 h, a drastic increase can be observed for instances with more than 80 orders for \(\lambda = 1\) and instances with more than 160 orders for \(\lambda = 2\). While the optimality gaps for the largest instances solved with \(\lambda = 2\) remain quite low, gaps up to \(9.75\%\) can be observed for the largest instances with \(\lambda = 1\). Considering a time horizon of 48 h, we provide results for up to 180 orders. For larger instances, the solver was not able to yield any feasible solution within the time limit of two hours. Only 20 orders could be solved to optimality on Network H48\(\lambda \)1 and up to 60 orders on Network H48\(\lambda \)2. The gaps on Network H48\(\lambda \)1 grow significantly and reach almost \(100\%\) at 180 orders. On Network H72\(\lambda \)1, the solver yields feasible solutions up to 60 orders, and up to 140 orders on Network H72\(\lambda \)2. As well as on Network H48\(\lambda \)1, only the instance comprising 20 orders could be solved to optimality on H72\(\lambda \)1. Especially the gaps grow quickly and lead to worse results than for \(\lambda = 2\).

Table 7 ILP KPIs: time horizon 24, 48 and 72 h, \(\lambda \in \{1,2\}\)

Table 7 shows details regarding different KPIs, such as the objective values, number of trains and train kilometers, for all instances that have already been considered in Table 6. We go into more detail concerning the objective values of instances where optimal solutions could be obtained for both discretization parameters. It turns out that on Network H24\(\lambda \)2, the optima worsen at least by \(8\%\), at most by \(14,4\%\) and on average by \(10,3\%\) compared to the results on Network H24\(\lambda \)1. Taking into consideration that train cost depict the largest cost factor in the objective function, the increase can be mainly addressed to the growing number of train kilometers when using \(\lambda = 2\) as discretization parameter.

A special attention has to be paid to the instances comprising 80 orders. In that case, a lower number of trains is needed for the instance where \(\lambda = 2\) is used. However, the number of driven train kilometers is larger instead, compared to the instance with \(\lambda = 1\). This can be explained by the fact that a different routing matrix was obtained and therefore, different arcs of the set of driving arcs \(A_1\) are used. Thus, certain trains do have to travel a longer distance which results in a higher number of train kilometers in turn.

Comparing the numbers of trains, driven train kilometers and the objective value of instances with different time horizons against each other, it can be observed that less trains are needed for longer time horizons. This can be mainly reasoned by the fact that it is easier to consolidate freight cars at certain shunting yards, e.g., freight cars can spend a longer time waiting for a train connection. This of course will lead to an increase of overall waiting times and therefore, also of the overall holding cost. Instances on a time horizon of 48 h or 72 h are affected by the maximal travel time parameter \({\mathcal {T}}_c\). However, a better objective value can be obtained since driven train kilometers can be decreased. We hereby refer to Sects. 4.2 and 4.3 where we take a closer look on train utilizations, waiting times and shunting processes.

4.2 Utilization of train capacities

In this section the utilization of the trains is examined in detail. For this purpose, we additionally indicate the total weight in tons (Weight t) and the total length in meters (Length m) of all orders for each instance. In Table 8 we state further abbreviations to characterize train utilizations used in Table 9.

Table 8 Abbreviations for Table 9

Note that only results for instances where an optimal solution could be obtained are stated in the tables. This is necessary in order to be able to compare the results against each other.

Table 9 Train utilization details: time horizons 24, 48 and 72 h, \(\lambda \in \{1,2\}\), length capacity of 400 m and weight capacity of 1600 t

When examining the time horizon of 24 h in Table 9, it can be observed that the average train utilizations, both for length as well as for weight, correlates positively with the number of orders to be transported. This can be explained by the fact that train cost depict the largest cost factor in the objective function. Hence, the ILP tries to minimize the number of used trains, which in turn results of course in a higher utilization rate of the same. Another interesting observation can be made when comparing the average utilization rates of each instance amongst each other. The average length utilization is higher for each instance compared to the average weight utilization. This can be explained by the fact that orders can consist of empty as well as of loaded freight cars. Thus, when transporting orders comprising empty as well as loaded freight cars within one train, the capacities are usually rather restricted by length than by weight. This aligns with the information we received by experts from RCA. Comparing the results of instances for both discretization parameters against each other, it can be seen that for all instances where a lower number of trains is needed, the average utilization rates increase. This could be reasoned by the fact that less weight as well as length train capacity is available. However, utilization rates are also strongly influenced by the generated routing matrix.

For the instances with time horizons of 48 h and 72 h we obtain similar findings as described above, the larger the number of orders to be transported, the higher the average utilization rates become. However, when considering instances with 20 and 40 orders for \(\lambda = 2\) and a time horizon of 72 h, it can be seen that the average length utilization rate slightly decreases. One reason could be the structure of the instance (e.g. the ready times of the orders) combined with a maximum allowed travel time of 24 h.

As mentioned in Sect. 4.1, in general, fewer trains are needed for instances with a larger time horizon. However, a time horizon of 72 h leaves enough space for a broad distribution of ready times of the orders and thus, it could be possible that it gets harder to consolidate orders at certain shunting yards in the network. This leads to a higher number of trains and a lower utilization rate. This assumption could be strengthened when comparing the instances with 40 orders and \(\lambda = 2\) for the time horizons 48 h and 72 h. In this case, more trains are needed for the instance with a time horizon of 72 h. However, as already mentioned in Sect. 4.1, in general fewer trains are needed for instances with a larger time horizon. In turn, this leads to an increase of the average capacity utilization rates, which can be observed when comparing the results with \(\lambda =1\) and \(\lambda =2\) of Table 9 against each other.

4.3 Shunting processes and waiting times

In this section we take a closer look at the number of shunting processes as well as on the waiting times of orders. In Table 10 we provide a description of further abbreviations occurring in Table 11.

Table 10 Abbreviations for Table 11
Table 11 Shunting details: time horizons 24, 48 and 72 h, \(\lambda \in \{1,2\}\)

Table 11 shows the results for instances with \(\lambda = 1\) and \(\lambda = 2\) for time horizons of 24, 48 and 72 h. Considering the time horizon of 24 h, for any instance, except of the one that comprises 20 orders, each of the shunting yards in the network is visited. On the one hand, this could be reasoned by the structure of the problem instances. Especially for instances with a higher number of orders to be transported, all shunting yards might have to be visited at least once since they can depict either the origin or destination node of an order. On the other hand, also the structure of the obtained routing matrix plays an important role hereby. The more complex the routing matrix, e.g. the less direct connections between two shunting yards exist, the more shunting yards will be visited. For both settings of the discretization parameter, the maximum, minimum and average number of shunting processes per order remains the same, except for one instance. Moreover, the total waiting times show a positive correlation with the number or orders. Again, this observation can be easily explained by the structure of the instances. As mentioned above, the higher the number of orders to be transported, the more yards have to be visited and thus, the higher the overall waiting times become. This can be linked to the fact that an order has to wait at least the duration of a shunting process at a visited yard. This can also be seen when examining the number of visited yard nodes. A higher number of waiting times means that a higher number of waiting arcs in \(A_2\) are used and thus, a higher number of yard nodes is visited. When comparing the results of the instances for \(\lambda = 1\) and \(\lambda = 2\) against each other, one can see that the number of shunting processes, as well as the total waiting times, show lower results for the instances with discretization parameter of 2 time units. Following the observations as described in Sect. 4.1, usually more trains are used for the instances with \(\lambda = 2\). Thus, it is more likely that the routing matrix comprises more direct links between two shunting yards. This in turn means, that the average number of trains used by an order decreases, resulting in a lower number of shunting processes needed and therefore, in a lower waiting time. These findings align with the results of instances on time horizons of 48 h and 72 h. A longer time horizon leads to a decrease in the number of needed trains, since it is easier to consolidate freight cars at certain shunting yards. Thus, the routing matrix comprises fewer direct train links between shunting yards. As a result, orders tend to visit more shunting yards on the route from their origins to their destinations. Therefore, more shunting processes have to be carried out, which in turn affects the overall waiting times in the network.

5 Conclusion and future work

In this paper we introduce the SCRMTSP as combination of the SCRMP and the FTSP. When solving this problem, two main aspects of freight railway optimization are considered jointly. First, the routing of single freight cars while following the concept of a routing matrix. Second, the generation of a freight train schedule. An ILP formulation, based on a time-space network, is presented for solving the SCRMTSP. To the best of our knowledge, this is the first approach, capable of solving the Single-Freight Car Routing Problem and the Train Scheduling Problem simultaneously, while additionally generating a routing matrix. An extensive computational study is conducted, based on real-world instances derived from our business partner RCA. The performance of the model is tested by varying the size of the problem graph, as well as the number of orders to be routed through the network. Furthermore, operational KPIs, such as the utilization of train capacities, waiting times of freight cars, or number of shunting processes are analyzed. It turns out, that even for our smallest problem graph, it is only possible to solve instances up to around 200 orders to optimality within two hours. This can be mainly addressed to the exploding number of decision variables as well as the complex structure of the constraints used for constructing the routing matrix.

However, during a standard day, the transport volume in RCA ’s single freight car transportation network can easily reach up to 1500 orders. Furthermore, following the observations as stated in Sect. 4, better results, in terms of the objective value, can be obtained when increasing the length of the time horizon. Thus, it is clear that a faster solution approach is needed in order to solve an entire real-world instance at once. This could be achieved by developing a suitable heuristic. A potential starting point could be the generation of the routing matrix before executing the optimization step. Considering a single shunting yard as destination station for orders, having their origin in any of the other shunting yards, the network follows the structure of a rooted in-tree (anti-arborescence). Thus constructing in-trees for each destination station and applying local search steps when combining the single in-trees in order to obtain a total routing matrix, could help simplifying the problem. One could also think of restricting train departure times and operating hours of shunting yards to certain time windows per day. In fact, in times where a strong passenger demand can be expected, some tracks are usually reserved for passenger trains only. In addition, especially for smaller shunting yards, shunting processes are often limited to certain operating hours per day. Therefore, it would be possible to remove several arcs from the problem graph and shrink its total size. As a result, it should be possible to solve problem instances with a higher number of orders, while modeling the problem in a more realistic way.

For future works it would also be interesting to solve the SCRMP and the FTSP separately and compare the results to those of the SCRMTSP. The performance of solving the SCRMTSP might be worse, however, it can be expected that the objective values of the SCRMTSP are better compared to those of the separated approach. This can be explained by the fact that the combined approach is able to find a global optimum while the separated approach can only obtain two local optima. Therefore, we suggest using suitable heuristics as mentioned above instead of solving the problem separately.