Optimisation of seat reservations on trains to minimise transfer distances

This paper introduces a novel optimisation problem motivated by the real-world application of transferring between trains. The driving idea is to minimise the walking distances of all transferring passengers by optimising the assignment of passengers to railcarriages in the seat reservation process. The focus of this work is on formalising and modelling the problem, which has not been studied before. It aims to provide a framework for future considerations. We present three versions of the problem with increasing difficulty: one-train-one-station, one-train-many-stations, and a general problem with multiple trains and stations. Since the simplest version of the problem is a pure assignment problem in the form of minimum-cost bipartite matching, our problem modelling and algorithmic solution approach remain closely related to maxflow problems as they represent a proven method for assignment problems. However, even the one-train-many-stations problem cannot be solved by transforming it into a standard maximum-flow problem. The main result shown is the NP-hardness of the general problem.


Introduction
The operation of public transport depends on diverse, often operation-critical optimisation problems with different planning scopes (Huisman et al. 2005), from strategic goals such as line planning to short-term goals such as crew scheduling (Neufeld et al. 2021). An individual traveller cannot influence many framework conditions such as line planning. All the more important are the optimisation problems that affect the individual journey within the given framework conditions. These optimisation problems include the search for train connections (Pyrga et al. 2007), which, in addition to a pure search for the fastest connection, also takes into account other factors such as the reliability of a connection or the walking distances to be covered in urban public transport (Disser et al. 2008). Seat reservation is also one of the customer-oriented optimisation problems, although up to now this has primarily been considered for group reservations as a variant of the 2-dimensional knapsack problem (Clausen et al. 2010).
This paper approaches the seat reservation process from a different angle. In order to increase travel comfort, passengers' seat reservations should be made in such a way that individual walking distances are as short as possible. This applies in particular to transfers at stations where, due to poorly matched reservations in two trains, a passenger has to walk the entire length of the platform although both trains stop there. When transferring to a train at the same platform, the next reservation should be close to the position where the passenger left the previous train. When transferring to a train at another platform, or at the beginning and end of the journey, the reserved seats should be close to the platform access. The objective of this work is to develop a mathematical model of the application problem and to classify the optimisation problems in terms of problem complexity. To our knowledge, this problem has not yet been formally addressed.
In the long term, this should answer the question of whether there are suitable heuristics that can be used to efficiently find an acceptable approximate solution offline. Based on this, an experimental study can show how large the difference is between such a reservation system and a random placement of passengers. Last but not least, an optimised reservation system could also have an impact on the operation of the railway, since transfer times between connections and dwell times of trains at the station can be planned differently.
These considerations are largely driven by the future vision that tickets will not only be used digitally via smartphone, but that the seats reserved will be determined shortly before the journey. In particular, this allows us to take into account situations such as trains arriving in reverse, missing railcarriages, and replacement trains. An online adjustment of reservations during operation is out of the scope of this work.
In the following section, the newly considered problem is framed by a literature review. Section 4 introduces the basic concepts of the railway infrastructure. The following sections define the optimisation problem with increasing problem difficulty: Sect. 5 considers one train in one station, Sect. 6 one complete train, and Sect. 7 a complete interconnection network. The latter problem is then analysed in Sect. 8 with regard to its computational complexity. The final Sect. 9 summarises the findings.

Related work
Seat reservation on trains has been the subject of various studies and publications in the past, which we summarise in this section. In addition, we also discuss some research that is closely related to the results presented here.
Online seat reservations. The oldest publications on seat reservations take an economic perspective and examine online reservation algorithms where not all requests are yet complete when seats are allocated. Those publications examine the degree to which online reservation algorithms can fill trains to capacity (Boyar and Larsen 1999; Bach et al. 2000; Miyazaki and Okamoto 2010), the so-called accommodating ratio. Two pricing schemes are compared: a fixed flat price and costs that are proportional to the length of the journey. In short, these works show that a significant proportion of seats cannot be allocated by an online algorithm because each decision is made with incomplete information. This effect can be countered by certain techniques, e.g. reserving only partial sections or by requiring passengers to change seats on a train (Solanki and Patidar 2019).
Offline group seat reservations. The conceptual counterpart is the offline allocation of seat reservations, where all reservation requests are known. From an algorithmic point of view, this is hardly interesting for individual reservations. Therefore, the research here deals with group reservations, which make it difficult to allocate all seats. Clausen et al. (2010) transform the problem into a two-dimensional knapsack problem or two-dimensional bin packing and investigate branch-and-bound algorithms with different bounds. Deplano et al. (2019) have extended the problem by attaching additional profits to various properties of the seats.
Offline seating arrangements. While the train utilisation is significantly better with an offline reservation, the business model of a railway company suggests an online algorithm. Kohrt and Larsen (2005) address this conflict by separating the mere decision as to whether a reservation can be fulfilled (as an online problem) from the actual allocation of seats (as an offline problem). They present a data structure in which the insertion and deletion of a reservation is possible in O(log p) (where p is the number of stations). This concept is an essential basis for the problem presented in our article, which attempts to achieve greater customer-friendliness when allocating seats offline in the second step.
Dwell times of passengers and the transfer coordination design problem. When approaches are presented in the literature to make a passenger's journey more comfortable, they mostly concern adjustments to departure schedules and better coordination of the different trains. Liu et al. (2021) present an extensive survey article on this type of problem, which mostly has minimising transfer waiting time, minimising total cost, or maximising the number of successful transfers as the optimisation objective. Since these considerations concern mass transportation, they are only of marginal importance for our problem. Mostly, cycle times, frequencies, or departure times are optimised, e.g. Jansen et al. (2002) minimise the weighted sum of transfer times with a tabu search heuristic and Nieuwenhuis (2021) uses a metaheuristic to adjust the departure times by a few minutes, achieving a 5.7% reduction in transfer times. A few works also consider passenger assignment as a decision variable, but this refers to the choice between different possible connections or paths and not to seat reservations (Zhou et al. 2019; Chu et al. 2019).
Modelling of connecting trains. In the reservation problems from the literature, transfer processes have not been taken into account so far, which is why the associated problem models do not have the concept of the connecting train. Therefore, we will now take a brief look at the models used in timetable queries. Pyrga et al. (2007) present two different ways of introducing time-dependencies into the graph representing the rail network and train connections. The time-dependent model contains only one node for each station and, for many queries, is superior to the time-expanded model, which duplicates the station nodes in the graph for the different departure times. To make the time-expanded model more competitive, Delling et al. (2009) simplified the original model by representing unimportant stations differently. As part of our work, we decided not to store time in the graph or model itself, but to consider time-dependencies externally when generating passenger travel plans.
Transfer cost. Passenger transfer costs beyond the dwell time between two connecting trains are rarely considered in the existing literature. An example of this is work by Li et al. (2021) in the context of optimised coordination of full-length and short-turn trains on an urban train line. The transfer costs are levied rather coarsely on the basis of the number of transfers.

Problem description
In this section, the optimisation problem is introduced informally and all assumptions concerning the modelling of the optimisation problem are summarised.

Transfer cost
In this paper we assume that the routing of the trains, the train schedule and the trains used are fixed. Likewise, the set of passengers with their individual travel plans, i.e. the assignment to a sequence of trains, is an input to our problem.
This leaves the seat reservations for passengers as the only decision variables within this optimisation problem. The objective is the length of the travellers' walking distances when transferring between the trains as well as when boarding and alighting the train at the beginning and at the end of the journey. The walking distances of all passengers should be kept as short as possible; to penalise very long individual walks, squared distances are used in the rest of the paper.
This problem is introduced in three variants with increasing difficulty. First, only boarding a train at a station is considered. In the second variant, the course of a train is modelled along the stations of its routing, the passengers stay on the train for an individual section at a time and they board the train and leave the train at a specific position on the platform in each case. In the last variant, a complete network of trains is considered and the passengers' journey can extend over several trains.

Assumptions for the optimisation problem
This section summarises the constraints and assumptions that we use to keep the model of the optimisation problem manageable enough for the scope of this paper.
This work focuses exclusively on the extent to which clever seat reservations can reduce the distances walked. All other factors are assumed to be invariant, and we abstract from overly fine details in the model.

Assumption 1
We assume that the assignment of a track to a train is immutable and a direct consequence of the given rail network.
Assumption 2
We assume that each railcarriage has exactly one entrance. Without loss of generality, the entrance is positioned in the middle of the railcarriage.
For purely technical reasons, we assume that the train line can be determined from the number of a railcarriage. In reality, a railcarriage is used several times a day on different train lines.

Assumption 3
Each railcarriage is only used in one train on one train line.
A model that takes into account individual travellers could easily be provided with in-depth details such as train departure times, waiting times, and alternative connecting trains. This would directly imply that route planning is also done as part of the optimisation. As this is outside the focus of the problem considered here, we refrain from providing this information and assume that the individual travel plans of the passengers were preceded by corresponding considerations.

Assumption 4
We assume that the temporal coupling of trains at a station arises from the existing individual travel plans.
In the literature, seat reservations have so far been considered mainly in the context of group reservations, without which reservations would be a simple problem in itself (Clausen et al. 2010). In this introductory paper, we do not consider group reservations because we first want to measure the complexity of the problem that arises from considering transfer paths alone.

Assumption 5
We assume that only individual reservations are possible.
This limits the directly applicable practical value of the considerations presented in this paper. Nevertheless, group reservations could be considered in a practical implementation. One possible approach could be direct reservation for groups and downstream optimisation for individual travellers.

Modelling the infrastructure
This section introduces the concepts of rail infrastructure, in particular stations, train routes, and trains deployed. The operating conditions at the station can then be used to define the transfer costs for a passenger.

Notations
In order to keep the model clear and to increase readability, the notations are summarised compactly in three tables. Table 1 contains the sets, indices, and parameters. Table 2 shows the relations and functions used to access certain properties of the objects in the model. Table 3 contains the single decision variable that is used in all three variants of the problem.

Station configuration
Definition 1 (Station) A station st consists of several platforms st.pl ⊂ ℕ, each of which usually has two tracks. Each platform pl ∈ st.pl has positions pl.pos ⊂ ℕ that correspond to sections of the length of a railcarriage and that are labelled by a continuous sequence of numbers. In addition, the station has a cross connection between the platforms at position st.cross.
Example 1 Figure 1 shows an exemplary station with two platforms. From platform 1 you can board the trains on tracks 1 and 2. At position 3, there is a cross connection to access the platform. Platform 2 illustrates the freedoms of the model: The platform does not start at position 1, but still has the cross connection at the same position as platform 1. In addition, platform 2 serves three tracks. A train on track 4 can only stop at positions 8, 9, and 10.
In this paper, only transfer paths resulting from seat reservations on one or more trains are considered and minimised. This means that unavoidable path sections of the transfer are not taken into account when calculating the costs. Because in the static view of the paper, due to Assumption 1, the track of each train stop is unchangeable, the distance between two platforms cannot be minimised and is not included in the subsequent definition of the costs. We also do not take into account the distance within a railcarriage to the actual seat, but are only interested in the routes to the entrance of the carriage (Assumption 2).
Table 2 (excerpt): a function that computes the position of a railcarriage at the platform from the operation conditions pl, pos, and dir; occ, a function used in the one-station problem to parameterise the number of already occupied seats in each railcarriage.
In order to achieve the lowest possible transfer costs for all travellers, we define the transfer costs as squared distances. As we will later consider the sum of all transfer costs, the squared cost function should penalise very long journeys by individual travellers, a technique borrowed from linear regression (cf. Downey 2014, p. 118).
Definition 2 (Avoidable distance cost when changing trains) A passenger who has to walk from position $pos$ on platform $pl$ to position $pos'$ on platform $pl'$ at station $st$ has the avoidable squared distance cost $(pos - pos')^2$ if $pl = pl'$, and $(|pos - st.cross| + |pos' - st.cross|)^2$ otherwise.
Because we only take into account the distance travelled on the platforms and not between the platforms, the platform numbers only serve as identifiers. The further definitions do not require a consistency check regarding the number of platforms at a station or the length of the platforms. Figure 2 illustrates two scenarios for calculating distance costs. In scenario (a), the trains stop at the same platform and there is a cost of $(2 - 5)^2 = 9$. In scenario (b), the trains stop at different platforms. In that case, only the horizontal distances are taken into account for the calculation of the distance costs. This leads to the costs $(|4 - 3| + |5 - 3|)^2 = 3^2 = 9$.
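The two scenarios of Fig. 2 can be reproduced with a short sketch. The function name `transfer_cost` and the parameter `cross` (position of the cross connection) are assumptions introduced for illustration, and the platform numbers 1 and 2 in scenario (b) are hypothetical; the piecewise rule is inferred from the two worked scenarios.

```python
def transfer_cost(pl, pos, pl2, pos2, cross):
    """Avoidable squared walking distance between two platform positions.

    pl, pl2   : platform identifiers (only compared for equality)
    pos, pos2 : positions on the respective platforms
    cross     : position of the station's cross connection (assumed name)
    """
    if pl == pl2:
        # Same platform: walk directly along the platform.
        return (pos - pos2) ** 2
    # Different platforms: walk to the cross connection and back out;
    # the distance between the platforms themselves is not counted.
    return (abs(pos - cross) + abs(pos2 - cross)) ** 2


# Scenario (a): same platform, positions 2 and 5.
cost_a = transfer_cost(1, 2, 1, 5, cross=3)
# Scenario (b): different platforms, positions 4 and 5, cross connection at 3.
cost_b = transfer_cost(1, 4, 2, 5, cross=3)
print(cost_a, cost_b)
```

Both calls evaluate to 9, matching the values stated for scenarios (a) and (b).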

Train configuration
To introduce the trains that connect the different stations, we distinguish between two different concepts. We first introduce the pure train route as a sequence of the stations visited. This is initially detached from the physical train, which is addressed as the second concept in Definition 4. The train route is also often referred to as a train line, which usually means the abstraction of several trains serving the same sequence of stations at different times. The definition of the train route implicitly includes a timetable and can therefore only be implemented through one train per day.
Definition 3 (Train route) A train route $tr$ is defined by its timetable with a sequence of $tr.len$ stations $\langle tr.st_1, \dots, tr.st_{tr.len} \rangle \in ST^*$ with $tr.st_i \neq tr.st_j$ for $i \neq j$. A train route $tr$ defines a partial order $<_{tr}$ on the set $ST$ as follows: $tr.st_i <_{tr} tr.st_j$ if and only if $i < j$. Let $TR$ be the set of all considered train routes. For the sake of simplicity, $ST_{tr} \subseteq ST$ denotes the set $\{tr.st_1, \dots, tr.st_{tr.len}\}$.
The notation using the Kleene star in $ST^*$ denotes the set of all sequences consisting of an arbitrary number of elements in $ST$.
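As a minimal sketch of the order induced by a train route, a route can be stored as a list of distinct station names, and two stations are compared by their index in that list. The function names `is_valid_route` and `precedes` are hypothetical and not part of the paper's notation.

```python
def is_valid_route(route):
    """A route must not visit any station twice."""
    return len(set(route)) == len(route)


def precedes(route, a, b):
    """Return True if station a is served strictly before station b on the route.

    Stations that are not on the route are incomparable, so the order is
    only partial on the set of all stations.
    """
    if a not in route or b not in route:
        return False
    return route.index(a) < route.index(b)


route = ["A", "B", "C"]
print(is_valid_route(route), precedes(route, "A", "C"), precedes(route, "C", "A"))
```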
The train actually used on a train route is introduced in the following definition by its railcarriages with their seating capacities. The separation of the train from the train route makes it possible to easily replace a train with a differently sized train in later work.
Definition 4 (Deployed train) A train route $tr$ is implemented by a deployed train. For this purpose, a concrete sequence of railcarriages $train(tr) = \langle w_1, \dots, w_\ell \rangle$ is assigned to the train, with the set $W$ containing all possible railcarriages. For the sake of simplicity, $W_{tr} \subseteq W$ denotes the set $\{w_1, \dots, w_\ell\}$.
Because of Assumption 3, for $tr, tr' \in TR$ with $tr \neq tr'$ it holds that $train(tr) \cap train(tr') = \emptyset$. As a consequence, the following definition introduces the properties of a railcarriage within a train.
Example 4 In the complex and closely interwoven rail network of Europe, the operation of a train and the order of its railcarriages are not trivial. Figure 3 shows a simple example with three stations. While the railcarriages at stations A and B are arranged with the direction of travel ( ↙ ), the train changes its direction of travel at station B, arriving at station C in reverse order ( ↗ ). However, if the train has to take the detour shown in dashed lines, it will keep its original direction at station C as well.
Caution: The direction of travel is a concept that refers exclusively to one station of the train. So this direction of travel actually depends on the side from which a train enters the station and the physical orientation of the train. In the following examples we will simplify and derive the direction of travel solely from the physical orientation.
The following definition determines how the positions of the railcarriages are actually calculated from the operation at the station.
Definition 7 (Carriage position) A train route $tr$ using $train(tr)$ with railcarriages $W_{tr}$ stops at station $st$ according to position $p = pos(tr, st)$ and direction $d = dir(tr, st)$. Then, when boarding and alighting, a railcarriage $w \in W_{tr}$ stops at platform position

Example 5
The resulting position of the railcarriages is shown in Fig. 4 for two trains. Depending on the variables pos and dir, the four cars are positioned differently on the platform.
This completes the introduction of the necessary infrastructure. The deployed train and its operation at the station provide all the information needed to calculate the distance cost for a passenger who leaves a railcarriage and goes to the exit (at the connecting corridor of the station).

One-station transfer problem
This section introduces the one-station transfer problem (or one-train-one-station problem) as a first, simple version of the transfer problem. It only considers multiple passengers boarding one train at one station.

Optimisation problem
Each passenger has a fixed point of arrival, either at the entrance to the platform or at an arbitrary position when actually transferring from another train. In addition, only selected seats are still available on the train in question. Now an allocation of the passengers to the free seats in the railcarriages is to be determined for which the sum of the distance costs is minimal.
Definition 8 (One-station transfer problem) A train route $tr$ is implemented by $train(tr) = \langle w_1, \dots, w_\ell \rangle$ and operated at station $st \in ST_{tr}$ according to $pl^* = pl(tr, st)$, $pos^* = pos(tr, st)$ and $dir^* = dir(tr, st)$. In each railcarriage $w_j$ ($1 \le j \le \ell$), there are $occ(w_j) \le w_j.cap$ seats permanently occupied. Given is the set of passengers $P$ where each $p \in P$ arrives at platform $p.pl_{in} \in \mathbb{N}$ at position $p.pos_{in} \in \mathbb{N}$ and transfers to train route $tr$. Then, in the one-station transfer problem, an assignment $Map$ of the passengers in $P$ to the railcarriages in $W_{tr}$ is sought such that no railcarriage $w_j$ receives more than $w_j.cap - occ(w_j)$ passengers and the sum of the avoidable squared distance costs (Definition 2) between each passenger's arrival position and the platform position of the assigned railcarriage (Definition 7) is minimal over all possible mappings $Map$.
Example 6 A small problem instance is shown in Fig. 5a, where there are three railcarriages with a capacity of 10 seats. Up to seven seats are already occupied in the individual railcarriages. There are 10 arriving passengers. A solution is shown in Fig. 5b with minimal cost $7 \cdot 1^2 + 1 \cdot 2^2 + 2 \cdot 0 = 11$.

Runtime complexity and problem difficulty
Because there is a close link between assignment problems and network flow problems (Ford and Fulkerson 1956; Gabow and Tarjan 1988; Kennedy 1995), this paper uses network flow as a modelling and solution approach for the problems presented. The simple problem variant described in this section is equivalent to weighted bipartite matching.
Example 7 Figure 6 shows how the transfer problem of Fig. 5 can be modelled by a minimum-cost flow problem (MCFP). There is a vertex for each position at the platform, into which the number of arriving passengers is induced using the incoming edges.
Because there are known polynomial-time algorithms for weighted bipartite matching, the following corollary holds.

Corollary 1 The one-station transfer problem is in P.
This was shown for minimum-cost bipartite matching by Kuhn (1955) via the Hungarian method with runtime $O(n^4)$ (where $n$ is the number of vertices). Gabow and Tarjan (1988) presented a maximum-flow algorithm with runtime $O(\sqrt{n}\, m \log(n \cdot C))$, where in addition $m$ is the number of edges and $C$ bounds the absolute value of the integer cost values. Schwartz et al. (2005) presented an approximation algorithm that solves the problem in $O(n^2)$ with high probability. It is likewise an open question whether further properties of the problem instances can be used to design more efficient exact algorithms for the one-station transfer problem.
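The flow model can be illustrated with a small, self-contained sketch. It is not one of the algorithms cited above but a plain successive-shortest-path min-cost flow, chosen for brevity. The instance is hypothetical (it is not the instance of Fig. 5), all passengers are assumed to arrive on the train's own platform, and the names `min_cost_flow` and `solve_one_station` are introduced only for this illustration.

```python
from collections import deque


def min_cost_flow(n, edges, source, sink):
    """Successive shortest paths; edges are (u, v, capacity, cost) tuples."""
    graph = [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])      # forward edge
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])   # residual edge
    flow = total = 0
    while True:
        dist, prev = [float("inf")] * n, [None] * n
        dist[source] = 0
        queue, queued = deque([source]), [False] * n
        while queue:  # Bellman-Ford with a queue, handles negative residual costs
            u = queue.popleft()
            queued[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v], prev[v] = dist[u] + cost, (u, i)
                    if not queued[v]:
                        queue.append(v)
                        queued[v] = True
        if prev[sink] is None:
            return flow, total
        push, v = float("inf"), sink
        while v != source:  # bottleneck capacity on the shortest path
            u, i = prev[v]
            push, v = min(push, graph[u][i][1]), u
        v = sink
        while v != source:  # augment along the path
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        flow += push
        total += push * dist[sink]


def solve_one_station(arrivals, free_seats, carriage_pos):
    """arrivals: {platform position: number of passengers};
    free_seats[j]: capacity minus occupied seats of carriage j;
    carriage_pos[j]: platform position of carriage j."""
    positions = sorted(arrivals)
    n_pos, n_car = len(positions), len(free_seats)
    source, sink = 0, 1 + n_pos + n_car
    edges = []
    for a, pos in enumerate(positions):
        edges.append((source, 1 + a, arrivals[pos], 0))
        for j in range(n_car):
            cost = (pos - carriage_pos[j]) ** 2  # squared walking distance
            edges.append((1 + a, 1 + n_pos + j, arrivals[pos], cost))
    for j in range(n_car):
        edges.append((1 + n_pos + j, sink, free_seats[j], 0))
    return min_cost_flow(sink + 1, edges, source, sink)


# Hypothetical instance: three passengers arrive at position 2, one at position 3.
result = solve_one_station({2: 3, 3: 1}, free_seats=[1, 1, 2], carriage_pos=[1, 2, 3])
print(result)  # (number of seated passengers, total squared distance)
```

For this instance all four passengers are seated at a total cost of 2: the passenger at position 3 boards the carriage in front of them, and two of the three passengers at position 2 each walk one position.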

One-train transfer problem
In this section, the problem is generalised by considering a train with all stops. Each passenger enters the train at one station and leaves it at a later stop. The problem is referred to as one-train transfer problem (or one-train-many-stations problem).

Optimisation problem
The following definition introduces the extended problem. Besides additional information for the passengers and some technical conditions, the major change is that the costs for the walking paths of all passengers at all stations are summed up.
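Definition 9 (One-train transfer problem) A train route $tr$ serving the stations $\langle tr.st_1, \dots, tr.st_{tr.len} \rangle$ is implemented by $train(tr) = \langle w_1, \dots, w_\ell \rangle$ and operated at stations $ST_{tr}$ according to the functions $pl$ (platforms), $pos$ (positions) and $dir$ (directions).
Given is the set of passengers $P$ where each passenger $p \in P$ is travelling using train route $tr$
• from station $p.st_{in} \in ST_{tr}$ at platform $p.pl_{in} \in \mathbb{N}$ and position $p.pos_{in} \in \mathbb{N}$
• to station $p.st_{out} \in ST_{tr}$ at platform $p.pl_{out} \in \mathbb{N}$ and position $p.pos_{out} \in \mathbb{N}$
where $p.st_{in} <_{tr} p.st_{out}$.
Then, in the one-train transfer problem, an assignment $Map$ of the passengers in $P$ to the railcarriages in $W_{tr}$ is sought such that the capacity of no railcarriage is exceeded on any section of the route and the sum of the avoidable squared distance costs (Definition 2) of all passengers when boarding and alighting is minimal over all possible mappings $Map$.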

Example 8
We consider a small, highly simplified example on the train route of Fig. 3 with three stations. The train consists of three carriages with two seats each. The upper part of Fig. 7 shows the sections where seven passengers arrive at the platform and leave the platform. Inside the station boxes, the placement and travel direction of the train are shown as well as one solution for the assignment of the passengers to the railcarriages. The passenger travelling from station A to station B incurs a cost of 1 when boarding and a cost of 0 when alighting. Passengers from A to C have a cost of $2 \cdot 1$ when boarding and a cost of $2 \cdot 2^2 + 1$ when alighting. Passengers from B to C have a cost of $2 \cdot 1 + 2^2$ when boarding and a cost of $2 \cdot 1$ when alighting. This brings the total cost to $1 + 0 + 2 + 9 + 6 + 2 = 20$.
The passenger from A to B could also be placed in the first railcarriage at the same cost. However, moving one of the three passengers travelling from A to C from the third railcarriage to one of the other railcarriages would significantly increase the cost of boarding the passengers at station B. This would also increase the total cost: If only one passenger is placed in the third railcarriage at station A, the cost would be at least 22.
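For very small instances, the one-train transfer problem can be checked by exhaustive enumeration. The sketch below is an illustration only and makes several assumptions: the instance is hypothetical (it is not the instance of Fig. 7), all passengers walk on the train's own platform, and the platform position of every carriage at every station is given explicitly in `car_pos` instead of being derived from pos and dir. The function name `one_train_brute_force` is introduced for this example.

```python
from itertools import product


def one_train_brute_force(stations, car_pos, capacity, passengers):
    """Enumerate all assignments of passengers to carriages.

    stations   : ordered list of station names on the route
    car_pos    : car_pos[st][j] = platform position of carriage j at station st
    capacity   : capacity[j] = number of seats in carriage j
    passengers : list of (st_in, pos_in, st_out, pos_out)
    """
    order = {st: i for i, st in enumerate(stations)}
    n_car = len(capacity)
    best_cost, best_map = None, None
    for mapping in product(range(n_car), repeat=len(passengers)):
        # Capacity must hold on every section between consecutive stations.
        feasible = True
        for s in range(len(stations) - 1):
            load = [0] * n_car
            for (st_in, _, st_out, _), j in zip(passengers, mapping):
                if order[st_in] <= s < order[st_out]:
                    load[j] += 1
            if any(l > c for l, c in zip(load, capacity)):
                feasible = False
                break
        if not feasible:
            continue
        # Squared walking distance when boarding plus when alighting.
        cost = sum(
            (pos_in - car_pos[st_in][j]) ** 2 + (pos_out - car_pos[st_out][j]) ** 2
            for (st_in, pos_in, st_out, pos_out), j in zip(passengers, mapping)
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_map = cost, mapping
    return best_cost, best_map


# Hypothetical instance: two single-seat carriages, train reversed at station C.
stations = ["A", "B", "C"]
car_pos = {"A": [1, 2], "B": [1, 2], "C": [2, 1]}
passengers = [("A", 1, "B", 1), ("A", 2, "C", 2), ("B", 1, "C", 1)]
best = one_train_brute_force(stations, car_pos, [1, 1], passengers)
print(best)
```

The enumeration grows exponentially in the number of passengers, so it only serves to validate models or heuristics on toy instances.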

Algorithmic approach using maxflow
Since the single transfer at a station corresponds to weighted bipartite matching, it makes sense to model the one-train transfer problem in principle as a minimum-cost flow problem (MCFP) using a sequence of matching problems. Figure 8 illustrates the basic idea:
• Passengers transferring at a station are modelled by complete bipartite subgraphs, similar to the example in Fig. 6 for the one-station transfer problem. In Fig. 8, these are the incoming edges to station A and the outgoing edges from station B.
• The movement of the train between two stations is modelled by edges that determine the new position for each railcarriage. In Fig. 8, these are the edges between station A and station B.
However, Fig. 8 shows that such naïve modelling is not sufficient: Because the passengers are required to exit at station B in reverse order (see the left part of Fig. 8), the correct minimum cost solution has transfer cost 2. But when the two stations are lined up as subsequent matching problems, the direct assignment of the individual passenger to the boarding and alighting position is lost. As a consequence, the solution sketched in the right part of Fig. 8 is computed, which has the minimal transfer cost 0. The naïve modelling as MCFP only ensures that the correct number of passengers enters and leaves the train. Instead, the modelling must be extended to include differently labelled items, leading to an instance of the minimum-cost multicommodity flow problem (MCMCF) (Ahuja et al. 1993). A separate commodity is used for each combination of traveller origin and destination. Figure 9 shows an instance of the minimum-cost multicommodity flow problem that can be used to solve the small example with three stations given in Fig. 7.
Fig. 7 Small example for the one-train transfer problem: Given problem instance with a capacity of 2 in each railcarriage. The arrows within the station boxes indicate a solution with minimal transfer cost
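The loss of passenger identity can be made concrete with a hypothetical two-passenger instance in the spirit of Fig. 8 (the actual figure data is not reproduced here, so all positions below are assumptions). Two single-seat carriages arrive at station B in reversed order. Solving boarding and alighting as two independent matchings yields cost 0, whereas any assignment that keeps each passenger in one carriage costs 2.

```python
from itertools import permutations

# Hypothetical instance: passengers 0 and 1, carriages 0 and 1 (one seat each).
board_pos = [1, 2]    # boarding positions of the passengers at station A
alight_pos = [1, 2]   # alighting positions of the passengers at station B
cars_at_a = [1, 2]    # platform positions of the carriages at station A
cars_at_b = [2, 1]    # the same carriages at station B, train reversed


def matching_cost(people_pos, car_pos, assignment):
    """Sum of squared distances if passenger p uses carriage assignment[p]."""
    return sum((people_pos[p] - car_pos[c]) ** 2 for p, c in enumerate(assignment))


assignments = list(permutations(range(2)))

# Naive model: each station is matched on its own, so a passenger may
# "board" one carriage and "alight" from another.
naive = (min(matching_cost(board_pos, cars_at_a, a) for a in assignments)
         + min(matching_cost(alight_pos, cars_at_b, a) for a in assignments))

# Consistent model: the same assignment is used at both stations.
exact = min(matching_cost(board_pos, cars_at_a, a)
            + matching_cost(alight_pos, cars_at_b, a)
            for a in assignments)

print(naive, exact)
```

The naive value is only a lower bound; it is not attained by any consistent seat assignment in this instance.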

Example 9
We note that the one-train transfer problem has more complexity than the mere sequencing of one-station transfer problems. This also means that we cannot yet offer a polynomial-time algorithm for the one-train transfer problem, since the integer MCMCF is known to be NP-complete (Even et al. 1976). Besides the open research question of whether the one-train transfer problem still lies in P, efficient heuristics and approximation algorithms are also a short-term research goal.

General train transfer problem
In this section, the problem is extended again by combining different trains. The problem is referred to as the general train transfer problem.
Fig. 8 Counterexample for reducing the one-train transfer problem to a minimum-cost flow problem (MCFP). The left part shows the transfer problem and the right part the respective model using two subsequent one-station problems. The first number at the edges corresponds to the capacity, the second number to the cost per transferred unit. The incoming and outgoing edges have a lower capacity bound noted as 1

Optimisation problem
To generalise the problem, the following definition introduces several trains and the individual travel plans of each passenger. As with the result of a timetable query, each passenger uses a combination of individual journey sections of the trains and thus determines the transfer between the trains.
Definition 10 (General train transfer problem) Given is the set of train routes $TR$ implemented according to $train$ and operated in the stations in $ST$ according to $pl$ (platforms), $pos$ (positions) and $dir$ (directions). The set $P$ describes all passengers where each $p \in P$ is travelling with an individual travel schedule $p.route$ of length $p.len$:
In the general problem, we assume that passengers always enter the system at the cross connection. Therefore, platform 0 is used for all arriving passengers and the position of the cross connection $p.st_1.cross$ is used as the position. The same applies to passengers leaving the system at the destination station.
Unlike time-expanded graphs for travel connection queries (Pyrga et al. 2007), we do not specify departure times at all and only link trains by travelling passengers (Assumption 4).
This property enables the modelling of time anomalies: After changing trains several times, we arrive back at the first train at the beginning of our journey, similar to the time loop in the movie "Groundhog Day". However, none of the examples below (including instances generated from other problems) exhibits such time anomalies.
Figure 10 presents a small connection network with three trains and eight stations. Train $t_1$ serves stations A, B, C, D, and E and has two railcarriages with a capacity of 3. Train $t_2$ serves stations F, G, D, and H and has three railcarriages with a capacity of 2. Train $t_3$ travels directly from G to B and has one railcarriage with a capacity of 3. A possible solution with minimal transfer costs is shown in Fig. 11. The aim is to keep individual contributions to boarding, transferring, or alighting as small as possible and to avoid the values $2^2 = 4$ or $3^2 = 9$. In the given solution, cost 4 occurs when passengers $e_1$, $f_1$, and $h$ disembark; these costs are unavoidable because, due to the railcarriage capacities, some passengers have to be placed at higher costs. The solution has a total cost of 33.

Solution using max flow
Let us assume here that the general problem should also be solved via a maximum flow problem. In principle, an MCMCF could be used as in Sect. 6.2, where each commodity corresponds to one passenger. However, the difficulty arises that a flow does not have to follow the specified route of a passenger's individual travel plan. Instead, the maximum flow is used to reproduce the timetable query, since there are several paths in the graph that connect a starting point and an ending point via Table 4 Passengers as part of the general train transfer problem given in Fig. 10 passengers route passengers route Fig. 11 One possible solution for the instance of the general train transfer problem given in Fig. 10 and Table 4. The labelling on the edges shows the occupancy of the railcarriages. The transfer costs for boarding, transferring and alighting are shown at the nodes different trains. However, the problem to be solved requires that the routes of the passengers be implemented directly. As a countermeasure, we need to expand the MCMCF with individual lower capacity bounds for each commodity (cf. Cappanera and Frangioni 2003). This way, the basic path can be described for each passenger.
We note here that while this is a fundamentally viable path for possible optimisation, we do not pursue this idea further. Modelling with the tight restrictions leads to overloaded maximum flow problems, which calls efficiency into question, especially since the theorem in the following section rather suggests the use of metaheuristics. Nevertheless, other optimisation techniques, e.g. mixed-integer programming solvers, might be able to efficiently solve problem instances of moderate size or with special properties of real-world scenarios.

Computational complexity
To assign the general train transfer problem to a problem class of computational complexity, we first transform it into a decision problem. This leads us to the following statement, which suggests that there is no algorithm for solving the problem in polynomial runtime.

Theorem 2 (NP-hardness of the general problem) The decision variant of the general train transfer problem is NP-hard.
The proof of the theorem is developed in the following sections.

Idea of the reduction
A common proof technique for the NP-hardness of a problem A is the reduction of a known NP-hard problem B to problem A. This means that from the detailed information of a problem instance of B a problem instance of A is calculated, so that from its solution the solution of the problem instance of B can be inferred. The conversion itself, however, may require at most polynomial time.
We prove the theorem by constructing a polynomial-time reduction from 3-SAT, the satisfiability of a formula in conjunctive normal form, to the general train transfer problem. However, we restrict the considered space of admissible formulas with the following assumptions.

Assumption 6
Each clause has at least two literals.

Assumption 7 All literals in a clause use different variables.
Without loss of generality, only formulas are considered that have clauses with two or three literals. Furthermore, a variable may appear only once in each clause. Clauses with only one literal imply the truth value of the associated variable and the formula can be simplified (cf. Hromkovič 2004, p. 185).
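The two assumptions and the simplification of single-literal clauses can be sketched as a preprocessing step. The code below is an illustration only: the DIMACS-style encoding (3 for x_3, -3 for the negated x_3) and the names `satisfies_assumptions` and `propagate_units` are assumptions, not part of the paper.

```python
from typing import List, Optional, Tuple

Clause = List[int]  # assumed encoding: 3 means x_3, -3 means NOT x_3


def satisfies_assumptions(formula: List[Clause]) -> bool:
    """Check Assumption 6 (two or three literals) and Assumption 7 (distinct variables)."""
    for clause in formula:
        variables = [abs(literal) for literal in clause]
        if not 2 <= len(clause) <= 3:
            return False
        if len(set(variables)) != len(variables):
            return False
    return True


def propagate_units(formula: List[Clause]) -> Optional[Tuple[List[Clause], List[int]]]:
    """Remove single-literal clauses by fixing their variable.

    Returns the simplified formula and the forced literals,
    or None if an empty clause (a conflict) arises.
    """
    clauses = [list(clause) for clause in formula]
    forced: List[int] = []
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, forced
        forced.append(unit)
        remaining = []
        for clause in clauses:
            if unit in clause:
                continue  # clause is already satisfied by the forced literal
            reduced = [literal for literal in clause if literal != -unit]
            if not reduced:
                return None  # every literal of this clause is falsified
            remaining.append(reduced)
        clauses = remaining
```

For instance, the unit clause in `[[1], [-1, 2, 3], [1, 2]]` forces x_1 to be true and leaves the single clause `[2, 3]`, which meets both assumptions.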
In short, we translate the propositional 3-SAT-formula into a train route network, where the clauses are assigned to stations and the left-to-right reading corresponds to the travelling direction of the involved trains. Simplified, the logical variables correspond to passengers boarding, transferring or terminating their journey at the stations that contain the variable in their associated clauses. For each clause (or its corresponding station), a train is set up to transport the up to three variables of its literals to their respective next clause/station: clause c_k is included as a stop for the train starting at clause c_i (i < k) if there is a common variable in c_i and c_k that does not occur in clauses c_{i+1}, …, c_{k−1}. For each variable considered in this construction, the corresponding passenger's itinerary requires the use of the trains thus defined.
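The stop rule can be illustrated with a short sketch. It assumes a DIMACS-style clause encoding (3 for x_3, -3 for its negation) and 1-based clause indices; the function name `next_stops` and the sample formula are hypothetical and do not reproduce any formula from the paper.

```python
from typing import Dict, List

Clause = List[int]  # assumed encoding: 3 means x_3, -3 means NOT x_3


def next_stops(formula: List[Clause]) -> Dict[int, List[int]]:
    """For each clause index i, list the clauses k > i served by the train of c_i.

    Clause c_k is a stop if some variable of c_i occurs in c_k
    but in none of the clauses strictly between them.
    """
    stops: Dict[int, List[int]] = {}
    for i, clause in enumerate(formula, start=1):
        targets = set()
        for variable in {abs(literal) for literal in clause}:
            for k in range(i + 1, len(formula) + 1):
                if any(abs(literal) == variable for literal in formula[k - 1]):
                    targets.add(k)  # first later clause that contains the variable
                    break
        stops[i] = sorted(targets)
    return stops


# Hypothetical formula with three variables and four clauses.
example = [[1, -2, 3], [1, 2], [-1, -3], [2, -3]]
print(next_stops(example))
```

In this sample, the train of clause 1 must stop at clauses 2 and 3, because x_1 and x_2 reappear in clause 2 and x_3 reappears first in clause 3.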
A minimum-cost solution to the transfer problem must correspond to an assignment of the logical variables to truth values that satisfies the 3-SAT problem. Therefore, each deployed train contains a railcarriage that carries the variables whose assignment satisfies the clause c i corresponding to the first station of the train. An additional railcarriage carries the remaining variables that do not satisfy the clause. Depending on the size s of the clause c i , the capacity of the railcarriage with the satisfying variables is set to s and that of the railcarriage with the non-satisfying variables is set to s − 1 . Consequently, at least one variable must be placed in the satisfying railcarriage. If a variable changing to another clause due to its travel plan is negated in the next clause, the direction of travel at the station is inverted: consequently, a variable satisfying the previous clause is classified as a non-satisfying variable in the next train route (and vice versa) with transfer cost of 0. If several negated and non-negated variables enter the next clause, additional precautions must be taken, which are discussed in the next section.
Furthermore, the same costs (amounting to 1) are incurred for each passenger when entering the system, regardless of whether the respective variable is assigned the logical value true or false. The same applies when leaving the system. Thus, a satisfying assignment of the logical variables of the 3-SAT formula corresponds to an occupancy of the seats where only the costs for the beginning and the end of each trip of a variable are incurred, i.e. the cost 2⋅(number of variables).
If the minimum-cost solution to the train transfer problem serves all passengers and has higher costs, the truth value of one of the variables changes during the train journey and there is no satisfying assignment for the 3-SAT formula.
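The correspondence described in the last paragraphs can be summarised compactly. The notation cost_min for the minimum total transfer cost of the constructed instance (with all passengers served) is an assumption introduced here for illustration.

```latex
\[
  \mathrm{cost}_{\min}(ST, TR, \mathit{train}, P) \;\ge\; 2\,\lVert V \rVert
  \quad\text{for every formula } F,
\]
\[
  F \text{ is satisfiable}
  \quad\Longleftrightarrow\quad
  \mathrm{cost}_{\min}(ST, TR, \mathit{train}, P) = 2\,\lVert V \rVert .
\]
```

The lower bound reflects the unavoidable cost of 1 for entering and 1 for leaving the system per passenger; equality holds exactly when no transfer between trains incurs a cost.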

Reduction algorithm
Algorithm 1 presents the individual steps of the reduction. The technical details are explained in more depth in the rest of this section.
For each clause, a train with three railcarriages is introduced in step 2(a). The middle railcarriage has capacity of 0 and is positioned at each station at the cross connection where passengers arrive and leave the station. The first railcarriage takes the satisfying variables, the last railcarriage the non-satisfying ones. In a clause with three (resp. two) literals, the first railcarriage has 3 (resp. 2) seats and the last carriage has a capacity of 2 (resp. 1).
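The effect of these capacities can be checked by enumeration. The sketch below uses hypothetical names (`railcarriage_capacities`, `feasible_seatings`) and assumes that one passenger per literal of the clause rides the train.

```python
from itertools import product
from typing import List, Tuple


def railcarriage_capacities(clause_size: int) -> Tuple[int, int, int]:
    """Capacities (first, middle, last) of the train created for one clause."""
    if clause_size not in (2, 3):
        raise ValueError("clauses are assumed to have two or three literals")
    return (clause_size, 0, clause_size - 1)


def feasible_seatings(clause_size: int) -> List[Tuple[str, ...]]:
    """All ways to seat one passenger per literal without exceeding a capacity."""
    first_cap, _, last_cap = railcarriage_capacities(clause_size)
    seatings = []
    for seating in product(("first", "last"), repeat=clause_size):
        if seating.count("first") <= first_cap and seating.count("last") <= last_cap:
            seatings.append(seating)
    return seatings


for size in (2, 3):
    seatings = feasible_seatings(size)
    # Every feasible seating places at least one passenger in the first railcarriage.
    print(size, len(seatings), all("first" in seating for seating in seatings))
```

Only the seating that puts every passenger into the last railcarriage is excluded, which is exactly the case in which no literal satisfies the clause.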
Since checking whether there is a satisfying literal in each clause corresponds to the seat occupancy during the train's journey, the corresponding train must travel a distance. Therefore, for the i-th clause, a station S′_i is introduced next to the train route's starting station S_i, which serves as the train's first stop [steps 1(a) and 2(b)]. If the i-th clause is the first clause containing a variable, the respective passenger boards the train at station S_i. If the i-th clause is the last clause containing a variable, the respective passenger alights from the train at station S′_i [step 2(b)]. All trains stop at the same platform and at the same positions within the stations, aligned with each other. As a consequence, there are no costs for changing between the first railcarriages in the same direction of travel; the same holds for changing between the last railcarriages. Figure 12 illustrates how the route of the train is constructed from a clause in step 2(b). Since the variables involved in clause 3 are negated (relative to the last clause visited), the train must change its direction of travel [step 2(c)]. Variable x_3 is also negated in clause 4, but since the train has already changed to the negated state before, the direction of travel must not be inverted again. A renewed change of direction would be necessary if clause 4 contained x_3 instead of ¬x_3.
Obviously, this construction is not sufficient when both negated and non-negated literals occur in the next clause, because the inversion of the train affects all involved variables. To solve this problem, we introduce an additional train [steps 2(d)-2(f)]. Let j be the current clause and k be the next clause. We introduce a new train that also travels from station j′ to station k, but arrives there in the opposite direction. Figure 13 shows this technique for the variables x_1 and x_2, of which only x_2 is negated in the next clause. At the additionally introduced station 1′, the passenger belonging to x_2 boards the newly introduced train. Thus x_1 and x_2 arrive at different positions at station 2, although they were assigned to the same railcarriage in clause 1. The new train has the same capacity restrictions as the original train; however, this does not imply any restriction, as only passengers of the original train will transfer to the new train. Tables 5 and 6 show, for all possible cases, how the train routes and the additional train are introduced.
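The case distinction between keeping the direction, inverting it, and introducing an additional train can be sketched as follows. This is a simplified illustration under stated assumptions: clauses use a DIMACS-style encoding (2 for x_2, -2 for its negation), "negated" is read as a change of polarity relative to the current clause, and the names `polarity` and `classify_transfer` are hypothetical. It does not reproduce the full case tables of the paper.

```python
from typing import List, Tuple

Clause = List[int]  # assumed encoding: 2 means x_2, -2 means NOT x_2


def polarity(clause: Clause, variable: int) -> bool:
    """True if the variable occurs non-negated in the clause, False if negated."""
    for literal in clause:
        if abs(literal) == variable:
            return literal > 0
    raise ValueError(f"variable {variable} does not occur in clause {clause}")


def classify_transfer(current: Clause, following: Clause,
                      travellers: List[int]) -> Tuple[str, List[int]]:
    """Classify the connection between two clauses for the travelling variables.

    Returns the case and the variables whose polarity changes.
    """
    flipped = sorted(v for v in travellers
                     if polarity(current, v) != polarity(following, v))
    if not flipped:
        return "keep direction", flipped
    if len(flipped) == len(set(travellers)):
        return "invert direction", flipped
    return "additional train", flipped


# Situation described for Fig. 13: only x_2 is negated in the next clause.
print(classify_transfer([1, 2], [1, -2], [1, 2]))
```

In the mixed case, the returned list names the passengers that would ride the additional train arriving in the opposite direction.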
Algorithm 1 Reduction of a 3-SAT instance to an instance of the general train transfer problem.
3SAT-to-Traintransfer(formula F) computes a set of stations ST, a set of train routes TR, their implementation train, and a set of passengers P in such a way that the solvability of the train transfer problem indicates the satisfiability of the 3-SAT formula. Let F ≡ c_1 ∧ … ∧ c_k be a formula with k clauses and literals over the set of variables V. According to Assumption 6, the i-th clause c_i has the form (ℓ_{i,1} ∨ ℓ_{i,2}) or (ℓ_{i,1} ∨ ℓ_{i,2} ∨ ℓ_{i,3}).

1. (initialisation of stations and passengers)
(a) For each clause c_i in F, two stations are generated: S_i and S′_i.
(b) For each variable x ∈ V: create a passenger p_x, search the first clause c_i where x is used, and initialise the travel route p_x.route = (S_i). Set P = {p_x | x ∈ V}.
(c) Iterate through the clauses and, for each variable x, create a sequence clauses_x that contains the clauses with x or ¬x as a literal.
(d) Initialise TR = ∅ and train accordingly.

Exactly one passenger is created for each variable (step 1(b)), and the trains of the itinerary are composed according to the construction of the trains explained above (steps 2(c) and 2(f)). For each passenger, the last station is the intermediate station of the last clause in which the corresponding variable occurs. Each passenger enters the station of the first clause at the middle position of the railcarriages and also leaves the last station at the middle position. Tables 5 and 6 show the resulting passenger routes for the complex cases in step 2(f).
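The initialisation in step 1 can be sketched in a few lines. The clause encoding (3 for x_3, -3 for its negation), the station labels such as "S1" and "S1'", the function name `initialise`, and the sample formula are assumptions made for illustration; the construction of the trains in step 2 is not covered.

```python
from typing import Dict, List, Tuple

Clause = List[int]  # assumed encoding: 3 means x_3, -3 means NOT x_3


def initialise(formula: List[Clause]) -> Tuple[List[str], Dict[int, List[str]],
                                               Dict[int, List[int]], Dict[int, str]]:
    """Create stations, initial passenger routes, clause sequences, final stations."""
    # Two stations per clause: the start station and the intermediate station.
    stations: List[str] = []
    for i in range(1, len(formula) + 1):
        stations += [f"S{i}", f"S{i}'"]

    # For each variable, the clauses (1-based) that contain it in either polarity.
    clauses_of: Dict[int, List[int]] = {}
    for i, clause in enumerate(formula, start=1):
        for literal in clause:
            clauses_of.setdefault(abs(literal), []).append(i)

    # One passenger per variable; the route starts at the station of its first clause.
    routes = {variable: [f"S{seq[0]}"] for variable, seq in clauses_of.items()}
    # The journey ends at the intermediate station of the last clause of the variable.
    final_station = {variable: f"S{seq[-1]}'" for variable, seq in clauses_of.items()}
    return stations, routes, clauses_of, final_station


# Hypothetical formula with three variables and four clauses.
stations, routes, clauses_of, final_station = initialise(
    [[1, -2, 3], [1, 2], [-1, -3], [2, -3]])
print(clauses_of)
print(final_station)
```

The clause sequences computed here are what the later steps would traverse to extend each passenger's route from one clause to the next.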

Proof
The proof of Theorem 2 follows directly from the two lemmata discussed in this section, which determine the correctness and algorithmic runtime complexity of the reduction.

Lemma 3 Algorithm 1 performs a correct reduction from 3-SAT to the general transfer problem.
Proof The construction of an instance of the train transfer problem ( ST , TR , train, P ) from a given 3-SAT instance F with variables V was presented in detail in the previous section.
Suppose F is satisfiable and A ∶ V → {true, false} is a satisfying assignment. Then we show by contradiction that there is also a corresponding train reservation with the minimum cost 2‖V‖. So we assume that the minimum cost solution requires a transfer cost > 2‖V‖. This is only the case if a passenger does not reboard at exactly the same position at a stopover, which means that the associated variable changes its truth value between two clauses. However, this case is only possible if all passengers want to get from a railcarriage with capacity 3 (or 2) to a railcarriage with capacity 2 (or 1) due to different directions of travel. This in turn means that the clause corresponding to the station is not satisfied by the initial satisfying assignment, which implies that the entire formula is not satisfied. Yet, there is a satisfying assignment A and, due to the construction in the reduction algorithm, we know that in each clause the literals evaluated true by A can be placed in the associated first railcarriage. Because of this contradiction, the assumption must have been wrong.
Fig. 12 Example visualizing the construction of a train from the first 3-SAT clause
Fig. 13 Example for the introduction of an additional train. Both x_1 and x_2 re-appear in the next clause but only x_2 is negated. Therefore, the passenger for x_2 will be routed using the additional lower train, and x_1 will stay in the original train
Table 5 Different cases for the processing of a clause c_i for which exactly two variables reappear for the first time in the same clause in the remainder of the formula. The clause c_i contains either literals ℓ, ℓ′, and ℓ′′ (in any order) or only literals ℓ′ and ℓ′′
Conversely, let the solution of the train transfer problem have transfer cost 2‖V‖ . Assume that there is no satisfying assignment A for the formula F. Again, we know from the construction of the reduction algorithm that missing transfer costs in the changeover mean that the truth value of the variables involved remains consistent. In addition, there must be a satisfying literal in each clause due to the capacities of the railcarriages. This contradicts the assumption, which must therefore also be false. This proves the statement of the lemma. ◻

Lemma 4
The runtime complexity of Algorithm 1 is in P.

Proof
In the runtime analysis we use the following simple facts: all clauses are bounded by length 3 (∈ Θ(1)) and the number of clauses ‖C‖ in F is bounded by ‖F‖ as given by the equation … Since ‖V‖ ≤ ‖F‖, the total runtime is in O(‖F‖^2) and, thus, the reduction algorithm is in P. ◻

Examples
Finally, the polynomial reduction of 3-SAT to the general train transfer problem will be illustrated with two small examples.

Example 11
Consider the following small propositional formula in 3-CNF with variables {x 1 , x 2 , x 3 } and four clauses.
The conversion of the 3-SAT-formula to the general train transfer problem results in the problem instance shown in Fig. 14.
There are two solutions for this problem instance, which are shown in Table 7. All transfer costs between the trains are 0. Therefore, the total cost is 6, incurred for boarding the first train and for leaving the trains at stations 3b and 4b.
Solution 1 corresponds to the satisfying assignment x_1 = x_2 = x_3 = 0 for formula (16): x_2 is seated in the first railcarriage of train 1, which means that it satisfies clause 1. Being a negated literal in clause 1, x_2 corresponds to the truth value 0 (false). Both x_1 and x_3 are placed in the third railcarriage of train 1 and are non-negated literals in clause 1. Consequently, both variables have the truth value 0 (false) as well, since they do not satisfy the clause.
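The way a truth value is read off a seating can be sketched as follows. The function name `decode_truth_value` and the data layout are assumptions for illustration; the literal polarities and seat positions are those described for clause 1 in Solution 1.

```python
def decode_truth_value(literal_is_positive: bool, in_first_railcarriage: bool) -> bool:
    """Truth value of a variable, given its literal's polarity and its railcarriage.

    A passenger in the first railcarriage satisfies the clause, so the literal is true;
    a passenger in the last railcarriage corresponds to a false literal.
    """
    literal_value = in_first_railcarriage
    return literal_value if literal_is_positive else not literal_value


# Clause 1 as described in the text: x_1 and x_3 non-negated, x_2 negated.
polarities = {"x1": True, "x2": False, "x3": True}
# Solution 1: x_2 in the first railcarriage, x_1 and x_3 in the last one.
in_first = {"x1": False, "x2": True, "x3": False}

assignment = {name: decode_truth_value(polarities[name], in_first[name])
              for name in polarities}
print(assignment)
```

All three variables decode to false, in agreement with the assignment stated for Solution 1.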

Example 12
We consider a small propositional formula in 3-CNF with variables {x_1, x_2} and four clauses. This is a minimal example for a formula that cannot be satisfied. Figure 15 shows the resulting instance of the general train transfer problem. There are only solutions with the minimum cost 8, which is composed of the cost 4 for the start and end of the journeys and the cost 4 for a transfer to a carriage at another position on the platform. All solutions are shown in Table 8. The first four solutions meet the capacity limits of the railcarriages as long as this is possible; a transfer operation with cost ≠ 0 is only carried out if the capacity of a railcarriage would be exceeded. The passengers placed in this way are written in bold.
The solutions in the lower part of Table 8 already perform an expensive, non-positional placement of a passenger at an earlier stage. All these solutions are variations of the first four solutions.
There is no solution to this problem with cost 2⋅(number of variables) = 4. Consequently, the given instance of the decision problem with cost_max = 4 cannot be satisfied. This corresponds to the unsatisfiability of formula (17).
Table 7 Solutions for the problem instance in Fig. 14. There are two solutions for the given problem. The initial seat occupancy of the individual carriages is displayed in the fields of the table, with the railcarriages separated by a delimiter (columns: Solution, Train 1, Train 2, Train 2a, Train 3, Train 3a, Train 4)
Fig. 15 Resulting problem instance for formula (17). The individual itineraries are given by the labels at the sections of the trains
Table 8 Solutions for the problem instance in Fig. 15. There are ten solutions for the given problem with transfer cost 8. The initial seat occupancy of the individual carriages is displayed in the fields of the table, with the railcarriages separated by a delimiter. The transfer causing the internal cost of 4 is marked in bold face
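The unsatisfiability claimed in Example 12 can be checked by brute force on a formula of the same shape. The sketch below assumes a DIMACS-style encoding and uses the four two-literal clauses over x_1 and x_2 in an arbitrary order; the clause order and the helper name `is_satisfiable` are assumptions and are not taken from formula (17).

```python
from itertools import product
from typing import List

Clause = List[int]  # assumed encoding: 2 means x_2, -2 means NOT x_2


def is_satisfiable(formula: List[Clause], num_vars: int) -> bool:
    """Try all 2^num_vars assignments and report whether one satisfies every clause."""
    for values in product([False, True], repeat=num_vars):
        if all(any((literal > 0) == values[abs(literal) - 1] for literal in clause)
               for clause in formula):
            return True
    return False


# All four sign combinations of two variables: each clause rules out one assignment.
unsatisfiable = [[1, 2], [1, -2], [-1, 2], [-1, -2]]
threshold = 2 * 2  # 2 * (number of variables), the cost bound of the decision problem

print(is_satisfiable(unsatisfiable, 2))
print(is_satisfiable(unsatisfiable[:3], 2))
```

The full formula has no satisfying assignment, whereas dropping any single clause makes it satisfiable; this matches the observation that no seating reaches the threshold cost of 4.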

Conclusions
This paper introduces a new, previously unexplored optimisation problem for train connections with seats to be reserved, which minimises walking distances when transferring between trains. With the increasing degree of digitisation, these considerations could soon reach a relevance threshold for real-world application.
The train transfer problem is defined in three variants which reflect the radius of operation. In principle, all three problem variants can be mapped to minimum-cost maximum flow problems with different extensions. However, while the simplest problem belongs to the complexity class P, NP-hardness is shown for the general problem in this paper.
While this work deals exclusively with the modelling of the new optimisation problem and its theoretical classification, it opens up a large number of research questions to be investigated in the future. This includes the development of heuristics and approximation algorithms, the identification of benchmark problems, and possibly the development of realistic problem generators that produce problem instances tunable by difficulty. The primary goal is to determine through in-depth research and empirical studies to what extent such problem cases can be solved on a larger scale. Furthermore, possible metaheuristics raise the question of the cost-benefit ratio: How much computing time must be invested to generate acceptable approximations, and are these costs worthwhile compared to the benefits?
In a vision where reservations are taken via online procedures and customer-friendly seats are then assigned offline just before departure, the basic problem needs to be expanded. On the one hand, the problem must also be extended to group reservations, and it must be examined up to which group size or number of group reservations such a two-stage procedure is still feasible. On the other hand, one of the strengths of the method is the ability to react quickly to unfamiliar situations. In order to realise this, a non-stationary variant of the problem must be introduced, in which a change in the operating conditions leads to an online optimisation of the seat assignment that was initially optimised offline.
Funding
Open Access funding enabled and organized by Projekt DEAL.

Conflict of interest
We have no conflict of interest to disclose that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.