Quadratic and higher-order unconstrained binary optimization of railway rescheduling for quantum computing

As consequences of disruptions in railway traffic affect passenger experience and satisfaction, appropriate rerouting and/or rescheduling is necessary. These problems are known to be NP-hard, given the numerous restrictions of traffic nature. With the recent advances in quantum technologies, quantum annealing has become an alternative method to solve such optimization problems. To use quantum annealing, the problem needs to be encoded in quadratic unconstrained binary optimization (QUBO) or higher-order binary optimization (HOBO) formulation that can be recast as a QUBO. This paper introduces QUBO and HOBO representations for rescheduling problems of railway traffic management; the latter is a new approach up to our knowledge. This new approach takes into account not only the single-track lines but also the double- and multi-track lines, as well as stations composed of tracks and switches. We consider the conditions of minimal headway between trains, minimal stay on stations, track occupation, and rolling stock circulation. Furthermore, a hybrid quantum-classical procedure is presented that includes rerouting. We demonstrate the proof of concept implementation on the D-Wave Quantum Processing Unit and D-Wave hybrid solver.


Introduction
Railway transport is perceived as a more sustainable and ecological alternative to individual mobility [1,2].The increasing train traffic and other safety-related issues cause dispatching problems in case of disturbances which may lead to rerouting and rescheduling.Failure to resolve them quickly and efficiently can cause inconvenience for the passengers and increase the costs.No matter what the reason for the disturbance is (technical malfunction of traffic control, collision with car or animal, system -see [3]), the objective is to reduce delay propagation [4,5,6].
One can harness quantum computing for solving the railway rescheduling and rerouting problem formulated as an optimization problem.A promising heuristic algorithm is quantum annealing, which relies on the quantum adiabatic model of computation [7,8,9].Commercially available quantum annealers are provided by the D-Wave company [10].The problem of interest needs to be formulated as an Ising model, which in turn determines the coupling strength between the pair of qubits of the machine.After the system is evolved slowly enough for a particular duration, it is expected to be found in the minimum energy state, encoding the solution that minimizes the objective function of the problem.Any problem that is formulated as a quadratic unconstrained binary optimization (QUBO) can be easily transformed into an Ising model and solved using quantum annealing in principle.Since it is more natural to express problems using QUBO representation than the Ising model, it is desirable to find QUBO formulations for optimization problems to target quantum annealing [11,12].A generalization of QUBO is higher-order binary optimization (HOBO) representation that allows not only quadratic terms but also higher-order terms in the objective function.There has been some recent work on formulating HOBO representations for combinatorial optimization problems in the context of quantum optimization [13,14,15].
The motivation of this paper is to demonstrate that it is possible to encode typical railway infrastructure and traffic conditions as QUBOs and HOBOs, making the problems quantum computing ready.This paper is a follow-up of [16] which comprises railway rescheduling under disruptions on a single-track railway line encoded using QUBO.Now we remove the restriction of single-track lines, enabling also double-and multi-track lines on model trains traffic on stations.We use a parallel machine approach improved by rerouting, resulting in a hybrid algorithm.The presented representations for railway rescheduling and rerouting include the conditions of the minimal headway between trains, minimal stay on stations, station/track occupation, and rolling stock circulation.We use a classical procedure that mimics real-life rerouting practices together with quantum annealing to solve the rescheduling problem and end up with a hybrid algorithm.Although the detailed discussion of our approach concerns the railway rescheduling problem introduced here, similar approaches can be adopted for problems from other branches of operational research such as factory trolleys or electric busses rescheduling/rerouting.
There is a vast amount of research in the scope of resuming the railway system's capacity and proper functioning after a disruption; for a systematic review see [17].There are also publications in which other techniques like genetic algorithms and deep learning techniques are used.One may find out more in numerous review papers [18,19,20] in the scope of optimization methods to solve railway conflict management problems.Given the NP-hardness of such rescheduling problems and their complexity, it is very challenging to solve them on current computational devices in a reasonable time.We expect quantum computing to offer novel opportunities to overcome these limitations.
In our approach, we chose the parallel machine approach, where trains have a fixed route within the stations [21].The reason is that passenger trains have fixed platforms within the station, and the platform change is an extraordinary situation that affects passengers.For demonstration reasons, we start with an Integer Linear Programming formulation where we use order variables [22] to determine the order of trains leaving the station.Alternatively, for the QUBO and HOBO approaches, we use discrete-time units [23], in which binary variables describe whether the event happens at a given time.
Our paper follows other research efforts towards solving transportation-related problems using quantum annealers [15,24,25] or quantum approximate optimization algorithm [26] (QAOA) [13].HOBOs are considered in some mentioned papers for various transportation problems.However, up to our knowledge, HOBO formulation is considered to address the railway rescheduling problem for the first time.
The paper is organized as follows.Section 2 gives a brief overview of the railway system model, which consists of infrastructure and traffic.In this section, we present the notions and formalism to describe the problem of railways rescheduling.In Section 3 we present a linear programming representation, we set out the QUBO and HOBO formulations, and we describe our approach to rerouting.We demonstrate the formulations in Section 4 both theoretically and using numerical calculations.The last section contains conclusions and a discussion on the possibility of further development of QUBO and HOBO representations to address railway rescheduling.

Railway system model
Trains run according to a schedule along the routes.The route of the train is composed of stations and lines between them.The line consists of one or more parallel tracks, each split into line blocks.The latter we understand as a track section between two signaling utilities that can only be occupied by one train at a time.Stations consist of tracks interconnected by railroad switches (referred to as switches).Similar to line blocks, stations consist of station blocks -track sections at stations between two signaling utilities that can be occupied by only one train at a time.Trains are controlled by dispatchers who can reroute or/and reschedule them if necessary.By rerouting we understand the change of the track used by a train within a line or a station.By rescheduling, we understand the modification of the train departure time in a way to avoid conflict and maintain the feasibility of the timetable.(Note that we define infrastructure terms from train traffic perspective rather than their physical characteristics, which is not the usual description in transportation research -we keep this description to keep it coherent with our mathematical model aiming to make it more illustrative.) Two trains meet and pass (M-P) meeting at the same spatial location while following the same route in opposite directions.Similarly, two trains meet and overtake (M-O) when one train overtakes another.Depending on the type of railway line, M-Ps and M-Os may occur at stations and/or on lines.We distinguish single-track, double-track, and multi-track lines.On single-track lines, trains can M-P and M-O only at stations.The usual use of double-track line is such that trains are heading in one direction on one track and in the other direction on the other track (unidirectional traffic).It implies M-P possibility at stations and lines and M-O possibility only at stations.We also consider another use of double-track lines as two parallel single-track lines (bidirectional traffic).In this mode, trains can M-O on the line between stations while heading in the same direction on both tracks (this is at the cost of M-P possibility).The bidirectional mode may also be used on multi-track lines.
Regardless of the type of line, trains need to keep minimal headway -the distance between two trains following the same direction to preserve safety conditions.Such headway can be measured either in space or in time if taking into account trains' speeds.Trains can terminate at a station, set off from a station, have scheduled stop there, or pass it by.As conflict [27] we understand the situation that occurs when at least two trains compete for the same resource in terms of blocks (station or line) or switches.In our model, we aim to resolve optimally all conflicts by rescheduling and rerouting while keeping the safety conditions and limiting the schedule modification.
For model's simplicity, let us assume that the schedule is a pre-set sequences of blocks with departure times assigned.We will refer to this as the default settings; any change will be considered as rerouting and creation the new model.In real rail traffic, the schedule is assumed to be conflict free, and conflicts appear due to delays.We define delays as the difference between t(j, s in ) or t(j, s out ) -the actual time of entering or leaving particular station s by train j, and the scheduled time σ(j, s in ) or σ(j, s out ).In the rest of this section we use s * for either s in or s out .Hence, the delay is: Following [27], we split the delay into unavoidable d u (j, s * ) and additional d a (j, s * ) in the following manner: By unavoidable delay, we understand the delay from outside the model that is propagated through the network, not including any delay that may be caused by other trains' traffic and that can not be controlled in the model.(Unavoidable delay may be caused by accidents, technical failure, delay from outside the analyzed network, or delays of the trains affected by those on subsequent stations.)The additional delay comprises delays beyond unavoidable caused by solving conflicts due to traffic, which is in control of our model.The latter is of main interest to us as our goal is to minimize the additional delays.As we intend not to extend the delays, we assume that the additional delays are limited by the parameter d max is a parameter of the model and limits the range of the integer variables in the linear model and the number of variables in QUBO or HOBO approaches; as such, it affects the problem size.It should not be set too low, resulting in a situation in which obtaining a feasible solution is not possible.There are a few possibilities for determining d max .Following [16] and Tab. 1 therein, one can use some simple heuristics such as FCFS (first come first serve) or FLFS (first leave first serve) to get the solution that is not optimal but feasible.(As discussed in [27] these heuristics are often used in real live railway rescheduling.)Such simple heuristics and solutions can be used to determine d max for the practical problem.(Bear in mind that in an advanced model, d max may also be train and station dependent.)

Switch
Figure 1: A comprehensive yet simplified illustration of railway infrastructure.
A summary of definition of railway terminologies is given in Tab.6 in Appendix A. The comprehensive illustration of railway infrastructure is given in Fig. 1.

Problem formulation
In this section, we discuss the conditions that need to be satisfied and the objective function of the problem.The symbols used are summarised in Tab. 1.Following [16], our goal is to minimize the weighted additional delay where J is the set of trains, S j the set of stations passed by j, w j is the particular weight reflecting the priority of the j'th train.For implementation reasons it is more convenient to use: For clarity of presentation, we introduce the minimal time train j is ready to leave s provided the initial conditions and that no other trains are on the route and denote it by υ(j, s out ).By definition, υ(j, s out ) = σ(j, s out ) + d u (j, s out ) and where the first line follows by Eq. ( 2), the second line follows by Eq. ( 1) and the third line follows by Eq. ( 6).Now we can rewrite the objective function defined in Eq. (5) using Eq. ( 7) as As the objective is defined, we move on to constraints derived from train traffic safety conditions and other technical issues.
We start with the minimal passing time condition which ensures that for any pair of subsequent stations (s, s ), that is on the route of j ∈ J, the entry time to station s is exactly equal to the leaving time of station s plus the time it takes for train j to move from s to s , which we denote by τ (pass) (j, s, s ), see also Fig. 2. Note that we make an assumption that the train can leave s only if it can proceed at full speed to s .Given this, the condition can be stated as: t(j, s in ) = t(j, s out ) + τ (pass) (j, s, s ).
Next, we move to the minimal headway condition.Consider trains j, j heading in the same direction.To determine their order, we use the precedence variables y(j, j , s out ) ∈ {0, 1} that is equal to 1 iff j leaves s before j .(The precedence variable implementation appears to be more efficient than the order variable implementation [28].)Naturally, for any j, j ∈ J and s ∈ S j S j , it follows that y(j, j , s out ) = 1 − y(j , j, s out ). (10) Assume that train j leaves s before train j .Then j needs to wait for at least additional τ (blocks) (j, s, s ) which is the minimal time (headway) required for train j (traveling from s to s ) to release blocks to allow j to follow at full speed, see also simple illustrative presentation in Fig. 2.However, if j is slower than j , then an additional waiting time of τ (pass) (j, s, s ) − τ (pass) (j , s, s ) is needed.For all j, j ∈ J d -the set of pairs of trains heading toward the same direction on the same route -and (s, s ) ∈ C j,j -the set of subsequent stations in the common route of j and j -the condition can be expressed as follows: y(j, j , s out ) = 1 =⇒ t(j , s out ) ≥ t(j, s out ) + τ (blocks) (j, s, s ) + max{0, τ (pass) (j, s, s ) − τ (pass) (j , s, s )}.(11) In a single track line, a train can enter the single line only if it is cleared by the train approaching from the opposite direction -we call it the single track line condition.Similar to y, we define the precedence variable z(j, j , s, s ) ∈ {0, 1}, that determines which train enters first the single track line between s and s .Note that the following is true for all j, j ∈ J o single -the set of all trains heading in opposite direction on the same track -and (s, s ) ∈ C j,j .z(j, j , s, s ) + z(j , j, s , s) = 1. ( By τ (res.)(j, s), we denote the time of using the conflicted resource (i.e.set of switches) by trains j at station s, see also Fig 2. For all j, j ∈ J o single and (s, s ) ∈ C j,j , the single track line condition is expressed as: If the train is due to stop at the station, then it needs to wait at least τ (stop) (j, s), which is the minimal stopping time at the station s by train j, see Fig. 2. Apart from this, the train must not leave before its scheduled departure time.This is called the minimal stay condition.This results in the following conditions for all j ∈ J and s ∈ S j : t(j, s out ) ≥ t(j, s in ) + τ (stop) (j, s), (14) and t(j, s out ) ≥ σ(j, s out ).
We also use the rolling stock circulation condition analogous to the one discussed in [16].By τ (prep.)(j, j , s) we denote the minimal rolling stock preparation time, if train j terminates at s and then starts as new j , see Fig. 3.For all s ∈ S and j, j ∈ J round s -the set of pairs of trains that terminates at s and set off as a new train -we have the condition: t(j , s out ) ≥ t(j, s in ) + τ (prep.)(j, j , s). ( There are cases where two trains are to use the same set of switches at station s while entering the station, and leaving it.This is called the switch occupancy condition.This condition is (partially) integrated with the single track line condition (a common set of switches where a single line enters/ leaves a station) and track occupancy condition (a common track that can be occupied by one train only).Hence as J switch s we consider the set of pairs of trains that compete for the same switch or switch set not considered in other conditions.For all s ∈ S and j, j ∈ J switch s , this condition can be stated as: y(j, j , s * * * ) = 1 =⇒ t(j , s * ) ≥ t(j, s * * ) + τ (res.)(j, s), s s j j τ (blocks) (j, s, s ) s s j j Figure 2: Illustration of τ (blocks) , τ (pass) and τ (stop) , in our model they are in time units.In this demonstrative example τ (blocks) requires passing two subsequent block sections, which is rather usual for trains traffic management, but not the limitation of the model.(We do not consider here the length of the train.) where s * , s * * , s * * * may be s in or s out depending on the particular situation on the station.Two trains can not occupy set of switches at the station -Eq.( 17) In Eq. ( 17), s * , s * * may be s in or s out depending on the particular trains at a station, similarly y(j , j, s * * * ).For example if j and j compete for the common switch as j and j both leave s, we have s * = s * * = s * * * = s out .
There may be also other possibilities, e.g.including z variable instead of y variable, however we do not discuss them in this simple model.Now, let's discuss the track occupancy condition.As we are using a parallel machine approach, trains are assigned to particular tracks and station blocks that can be occupied only by one train at once.Consider two trains j 1 , j 2 that compete for the same track at the station.The subsequent train has to wait until the previous one leaves.This results in for all s ∈ S and j, j ∈ J track s .Here J track s is the set of trains that compete for the same track at station s.The additional term τ (res.) can be used if the two above-mentioned trains use the same set of switches (then the pair is excluded from J switch s ).

Integer linear programming representation
Based on the problem formulation presented above, we construct an integer linear programming (ILP) formulation.To linearize the implications, of the form a = 1 =⇒ b ≥ c, we use the transformation b + M (1 − a) ≥ c where M is a large constant.Furthermore, we use Eq. ( 10) and Symbol Description j ∈ J , (j, j ) train, pair of trains J d , J o single set of trains heading in the same direction on the same route, in opposite directions on the same track J track s , J switch s set of trains that compete for the same station block, switch at station J round s the set of all pairs of trains such that j terminates at s turn around and starts from s as j s ∈ S, (s, s ) station, pair of stations S j set of all stations in the route of train j C j , C j,j set of all subsequent pairs of stations in the route of j, common route of j, j σ(j, s out ) scheduled time of entering, leaving station s by train j υ(j, s out ) minimal time the train j is ready to leave s, provided the initial conditions and that no other trains are on the route d(j, s out ), d u (j, s out ), d a (j, s out ) delay, unavoidable delay, additional delay of train j on leaving station s d max maximum possible (acceptable) additional delay number of trains each train may be in conflict at each station, track or switch (on average) τ (pass) (j, s, s ) minimal passing time of train j between s and s (the time it takes train j to travel from s to s ) τ (blocks) (j, s, s ) minimal time required for train j (traveling from s to s ) to release blocks to allow another train to follow at a top speed τ (stop) (j, s) minimal stopping time at the station s by train j τ (prep.)(j, j , s) minimal rolling stock preparation time τ (res.)(j, s) time of using the conflicted resource (i.e.set of swishes) by trains j at stations s w j weight of train j in the objective p sum , p pair , p qubic penalty constants for HOBO / QUBO formulation.f objective function.M a large constant for linearization.
Table 1: Summary of the notations used in the paper to denote parameters of the model.
s Figure 3: Illustration of τ (res) and τ (prep) .Train j terminates at station s and the rolling stock is changed to another train j (upper panel).Train j occupies switch at station s, and such switch is not available for other train at that time (lower panel).
Eq. ( 12) for the simplification of the equations with precedence variables.We use the variables t(j, s out ), y(j, j , s out ) and z(j, j , s, s ) as defined previously.ILP takes the following form.

Symbol
Type Description t(j, s out ) integer time of train j on leaving station s t(j, s in ) integer time of train j on entering station s, uniquely determined by t(j, s out ) y(j, j, s out ) binary 0-1 1 iff j leaves s before j (determines the order of trains j and j while leaving station s) z(j, j , s, s ) binary 0-1 1 iff train j enters the single track line between s and s before j .
(determines the order of trains j and j while entering the particular track line between station s and s ) x j,t,s binary 0-1 1 iff train j leaves station s at time t.xj,j ,t,t ,s binary 0-1 auxiliary variable for HOBO quadratisation xj,j ,t,t ,s = x j,t,s x j ,t ,s subject to t(j , s out ) + M • z(j, j , s, s t(j , s in ) + M • y(j, j , s out ) − t(j, s out ) ≥ τ (res.)(j, s) t(j , s * ) + M • y(j, j , s * * * ) − t(j, s * * ) ≥ τ (res.)(j, s) The range for the integer variables t(j, s out ) follows since the following is true by Eq. ( 1), Eq. ( 3), and the definition of υ(j, s out ).
Although we use the variables t(j, s in ) for the clarity of the formulation, thanks to the first constraint, they are defined uniquely and not used when formulating the program.Given this, we have roughly a single time variable per station and train (but some trains may not serve all stations) and overall : Similarly, we define the precedence variables only for an ordered pair (j, j ) as the corresponding variable can be replaced using Eq. ( 10) and Eq. ( 12).This results in a single precedence variable y per station and train pair.
However, we do not need to compare all pairs of trains in case of dense train traffic, and the number of trains to be compared is somehow limited by d max .(There will be pairs that would never meet for given d max ).Let assume each train can be in conflict with trains at each station, track or switch (on average).N (d max ) is non-decreasing in d max .We have then the approximation: We also have some additional precedence variables e.g. for the single-track lines.Using similar approximation: This is however adequate if all trains use the single track line, otherwise, we can treat it as the limit.
The number of minimal headway Eq. ( 21), and track occupancy Eq. ( 26) constraints are both roughly equal to number of y variables, as each such variable concerns the conflict on these conditions.The number of single track line Eq.( 22) conditions is roughly proportional to number of z variables from the same reason.The number of minimal stay constraints Eq. ( 23) and Eq. ( 24) are both limited (or can be approximated) by |S| |J | (limit comes from the fact that not all trains serve all stations).
The number of rolling stock circulation constraints Eq. ( 25) is not large in comparison with others, for sure it is limited by |J |  2 (this would be a situation that one-half of the trains turn to another half).The number of variables in switch conditions, Eq. ( 27) is not straightforward, as there are many possibilities and approaches.We can again approximate them by the number of y variables.The number of constraints can be approximated/ limited by: Hence, one can conclude that if d max is set properly, the problem size should be linear in the number of trains and stations.
It is broadly accepted that railway problems are equivalent to job-shop models with blocking constraints, see eg. [29] (such job-shop is equivalent in principle to the set partition problemsee eg.[30]).In detail, in such an NP-hard problem, we have the release t i and due dates υ i of jobs, requirements of the model (blocking constraints), and there may be also some additional constraints such as no-waiting, and recirculation (rcrc).In our analogy, trains are jobs, and selected block sections are machines.With the standard notation of scheduling theory [31], our problem falls into the class J |t i , υ i ,block, no wait, rcrc| j w j T j .Above mentioned conditions comply with ours in the following way: 1. Eq. ( 19) is the objective, weighted tardiness with incorporated due time υ, 2. Eq. ( 20) is the no-waiting constraint on the line (on the station waiting is allowed), The presented linear programming approach is a standalone model.However, it fails in rapid computation for some models with more than a few trains [28].Hence it may be beneficial to use another computation paradigm, such as quantum (or quantum-inspired) annealing.As the alternative, in the next subsection we derive the HOBO representation directly form dispatching conditions (i.e.independently on ILP).

HOBO representation
A higher-order unconstrained binary optimization (HOBO) problem involves the minimization of a multilinear polynomial expression defined over binary variables where x denotes the vector of all binary variables x 1 , x 2 , . . ., x n , V = {1, 2, . . ., n} and c S are the real coefficients.It is also equivalently expressed as Pseudo-Boolean optimization [32] and polynomial unconstrained binary optimization [33].
The degree or order of a HOBO is the size of the largest set S. The problem is called quadratic unconstrained binary optimization (QUBO) when the degree is equal to 2, and the term HOBO is often used for higher-order problems.For the parallel machine approach adopted in this paper, we have the third order of HOBO.
To formulate the problem, we use the time indexing variable that is 1 if train j leaves station s at time t, and 0 otherwise (recall that each time index t can be represented uniquely by delay via Eq.( 1)).Note that, we use Eq. ( 9) to compute the arrival time from the departure time from the previous station.We use the discretised t that is limited from both sides by Eq. ( 30) and Eq.(31).We denote this limit by t ∈ T j,s , where T j,s ≡ {υ(j, s out ), υ(j, s out ) + 1, . . ., υ(j, s out ) + d max }, here we consider one-minute resolution.This limitation ensures the timetable condition in Eq. ( 15).We have the linear objective function defined as in Eq. ( 8): In our approach, we do not take into account recirculation, i.e. each train leaves each station s ∈ S j once and only once: To convert the constrained problem into an unconstrained one, we use the well-established penalty method [34].Constraints are incorporated into the objective function so that violation of the constraints adds a positive penalty to the objective function.For instance, to include the constraint in Eq. ( 40) in the objective function, we set a large enough penalty constant p sum and use the following penalty term: x j,t,s x j,t ,s − t∈T j,s x j,t,s Following [16], the conditions described in Eq. ( 9) -( 18) can be expressed using binary variables so that the quadratic terms yield 0 if the solution is feasible, and produces a penalty otherwise.
For this reason, we use a sufficiently large penalty constant p pair .Note that we have symmetric terms (x 1 x 2 + x 2 x 1 ) to follow the convention of symmetric QUBO formulation.
The minimal headway condition given by Eq. ( 10) and Eq. ( 11), can be expressed in the following form: P headway pair (x) = p pair j,j ∈J d (s,s )∈C j,j t∈T j,s ,t ∈T j ,s ,A<t −t<B (x j,t,s x j ,t ,s + x j ,t ,s x j,t,s ), where The single track condition defined in Eq. ( 12) and Eq. ( 13) yields: pair (x) = p pair j,j ∈J o single (s,s )∈C j,j t∈T j,s ,t ∈T j ,s A<t −t<B (x j,t,s x j ,t ,s + x j ,t ,s x j,t,s ), where A = −τ (res) (j, j , s ) − τ (pass) (j , s , s ), The minimal stay condition given in Eq. ( 14) (incorporated if necessary with Eq. ( 15)) yields: P stay pair (x) = p pair j∈J (s,s )∈C j t∈T j,s ,t ∈T j,s t <t+τ (pass) (j,s,s )+τ (stop) (j,s) (x j,t,s x j,t ,s + x j,t ,s x j,t,s ). ( The rolling stock circulation condition in Eq. ( 16) yields: P circ pair (x) = p pair s∈S (j,j )∈J round s t∈T j,s ,t ∈T j,s t <t+τ (pass) (j,s,s )+τ (prep.)(j,j ,s) (x j,t,s x j,t ,s + x j,t ,s x j,t,s ). ( The switch occupation condition in Eq. ( 17) yields: P switch pair (x) = p pair s∈S j,j ∈J switch s t∈T j,s ,t ∈T j ,s −τ (res.)(j ,s)<t −t<τ (res.)(j,s) (x j,t,s x j ,t ,s + x j ,t ,s x j,t,s ).(46) The above can be checked alone or integrated with other conditions such as track occupation condition in Eq. ( 18) and single track condition in Eq. (43).The order of trains can be changed at the station only if these trains use different tracks at the station.Suppose that j and j are on the same track at the station, hence they can not change order.To express this condition we need a higher order term, which yields a HOBO formulation.Let t = t(j , s out ), t = t(j , s out ) and t = t(j, s out ), where s is a station prior to s in the route of j .If j leaves before j , i.e. t < t ( t = t to prevent trains leaving the same track at the same time), then j must enter after j leaves i.e. t + τ (pass) (j , s , s) ≥ t + τ (res) (j, j , s).The following term needs to be 0: qubic (x) = 2p pair s∈S j,j ∈J track s t∈T j,s , t ∈T j ,s t ∈T j ,s t +τ (pass) (j ,s ,s)−τ (res) (j,j ,s)<t≤t x j,t,s x j ,t ,s x j ,t ,s .
We use the penalty value 2p pair to be consistent with the symmetrization.
The resulting HOBO representation is expressed as: min.h(x) =f (x) + P sum (x) + P headway pair (x) + P 1track pair (x) + P stay pair (x) + P circ pair (x) + P switch pair (x) + P occ.qubic (x), where f (x) is the objective function and the rest are the penalty terms that need to be minimized.
The penalty constants p sum and p pair has be large enough to ensure the constraints to be always fulfilled, regardless the penalty value in the objective.However, these constants cannot be too high; because, in that case, they may affect the performance of the quantum annealer.
The number of variables x j,t,s depends on the time resolution of the system and d max .It can be approximated by: # Here "≤" sign is used as some trains may not serve some stations.The number of variables here (as in ILP case Eq. ( 32)) should be linear in the number of trains and stations, but it is also proportional to (d max + 1).This can be however contrasted with the fact that in the ILP case, we have a broader range of time-indexed variables.As argued before, we assume that each train can be potentially in conflict with N (d max ) at each station, line or switch.

QUBO representation
A quadratic unconstrained binary optimization (QUBO) problem is formally defined as where Q is a real matrix of coefficients.To be able to solve a problem using quantum annealing, we must first encode it using QUBO formulation as current quantum annealers allow only two-body interactions and representation through Ising model.
In this section, we will convert the HOBO representation into a QUBO representation.Note that we formulate HOBO directly from the dispatching conditions.The advantage of such a take is that in HOBO, we have one-to-one relation between real dispatching constraints and penalties of the mathematical formulation of the problem.(Latter auxiliary variables are only used in quadratization of HOBO).Alternatively, to obtain a QUBO formulation for the problem, one can transform the ILP presented in Section 3.1 by first converting inequalities into equalities using slack variables and then moving equality constraints to the objective using the penalty method.The ILP formulation requires binary variables quadratic in the number of trains.Furthermore, for the transformation, additional slack variables are needed as many as the number of inequality constraints which is quadratic in the number of trains, and they need to be optimized within the model as well.Since our HOBO approach is linear in the number of trains, we think that using it as the basis of the QUBO formulation may be more adequate for dense railroad traffic, with rather small delays; for instance for metro, trams, and urban rapid transport.
The qubic terms in the HOBO representation need to be converted to obtain a QUBO representation.The cubic terms can be expressed using quadratic terms at the cost of introducing new binary variables, see [35].For the decomposition, we use the auxiliary variable xj,j ,t,t ,s = x j,t,s x j ,t ,s .The simplest approach here is to use the Rosenberg polynomial approach [36].The constraint: is equivalent to: i.e. k = k(i 1 , i 2 ).Then one can use the polynomial: that is 0 if xk = x i 1 x i 2 , and positive (equal to 1 or 3) otherwise.Using the auxiliary vector of variables x, the penalty terms will be as follows: where Γ is a set of particular indices of the cubic term (in Eq. ( 47)), and Γ a set of indices, where we require Eq. (51) to hold.Observe that for each pair of trains and for each station where the track occupation condition is to be checked; we have roughly (d max +1) 2 auxiliary variables.Hence, this condition needs to be used with caution while modeling railway systems of considerable size.
The resulting QUBO representation is expressed as: min.q(x, x) =f (x) + P sum (x) + P headway pair (x) + P 1track pair (x) + P stay pair (x) + P circ pair (x) + P switch pair (x) + P qubic (x, x), where f (x) is the objective function and the rest are the penalty terms that need to be minimized.
The number of x variables x j,t,s depends on d max .It can be approximated by: as we use the same approximation as in (34).We also have in mind that some trains may not serve some stations.When compared with Eq. ( 33), we can conclude that for the QUBO approach we need to control d max more strictly.QUBO implementation may still be efficient but for small d max determined, e.g., by some simple heuristics.

Rerouting formulation
We aim to solve the problem of setting the order of already delayed trains having limited resources in terms of infrastructure and traffic regulations.We follow the general idea set out in [37] where the widespread optimization problem needs to be decomposed into smaller components to demonstrate the supremacy of quantum or quantum-inspired approaches.In our case, we propose a decomposition that mimics some real-life rerouting practices.Namely, trains follow their default route as long as it does not cause distortion.Here we have the subproblem to be optimized by classical, quantum, or hybrid quantum-classical resources.If a solution is not satisfactory, we can change the path of the selected trains (aka reroute them) using the classical approach and then solve the new subproblem.We propose the following algorithm summarized in Fig. 4. The red region indicates the part that can be performed using the quantum (or quantum-inspired) resource at the current state of the art.As quantum computing becomes more and more advanced in the future, we will be moving the quantum border wider and wider to cover the whole algorithm finally.We start from the given infrastructure, schedule, maximal possible additional delay parameters, priorities of the individual trains, and the default train routes (aka default setting).Then we perform the optimization and check both feasibility of the solution as well as the objective value.If the solution is infeasible, we pick the nonfeasible conflict.Similarly, if we find the objective value too high, we pick the conflict, increasing the objective value the most.From this conflict, we pick one train (the one with lower priority) and reroute it by: 1. changing the track to the parallel one, 2. changing the platform at the station, 3. changing the path within the station.
We repeat the procedure until we get a satisfactory objective value or we achieve some stopping condition.The optimization subproblem (red) can be encoded either as a linear program, or following the QUBO or HOBO approaches.

Demonstration of the model
We consider a railway model which we depict it in Fig. 5.There are 2 stations s 1 and s 2 , a double-track line between them, and a depot; the switched are represented with c i .We have 3 trains: • Inter-City (the faster one): j 1 , s 1 → s 2 , • Regional: j 2 , s 1 → s 2 , • Regional: j 3 , s 2 → s 1 .
5. After entering s 2 , both j 1 and j 2 departs to the depot after the minimal stay.We only count delays of j 1 and j 2 at s 1 and delay of j 3 at s 2 .
Assume all trains are already delayed.Hence, they can leave the stations as soon as the resources (free rail track ahead) are available.We consider the objective as denoted in Eq. ( 8), with the following weights w j 1 = 2.0, w j 2 = w j 3 = 1.0 (Inter-City train has higher priority).We set d max = 10 for all trains, and use 1 minute resolution.The initial conditions are as follows: ν(j 1 , s 1out ) = 4, ν(j 2 , s 1out ) = 1, and ν(j 3 , s 2out ) = 8.We compute unavoidable delays and νs prior to the optimization.Particular departure times of the trains are in the following range: Now, we will investigate the linear programming approach and the time-indexed representation which leads to the QUBO formulation.For QUBO and HOBO representations, we use the following time indexed variables x j 1 ,t 1 ,s 1 : and we will replace the occurrences of the variables on the left hand side using Eq.(57) in the ILP formulation.Note that we use t 1 , t 2 , t 3 only to compute the penalty for the delays.
In the QUBO formulation, we have the following penalty term from Eq. ( 41) ensuring that each train leaves each station only once: As the default setting, we consider a double-track line, where each track has its own direction (unidirectional traffic).There is a conflict between j 1 and j 2 on the line from s 1 s 2 .If j 1 goes first at t = 4, then j 2 can start earliest at t = 6 (with an additional delay of 5) to proceed at full speed.If j 2 goes first at t = 1, then j 1 can start earliest at t = 7 (with an additional delay of 3) to proceed at a full speed.In both cases, j 3 can proceed undisturbed.
In the case of linear programming, the conflict can be resolved by setting the order variable y(j 1 , j 2 , s 1out ) ∈ {0, 1} to 1 if j 1 goes first and 0 if j 2 goes first.Recall that y(j 2 , j 1 , s 1out ) = 1 − y(j 1 , j 2 , s 1out ).Referring to Eq. ( 21), where M is a large number.Equivalently, t 2 − 2 < t 1 < t 2 + 6 is not allowed in the time-indexed variable approach.Hence, we have the following QUBO penalty term: We can express the minimal stay condition in Eq. ( 23) as and the corresponding QUBO term would be The track occupancy condition as defined in Eq. 26 for the track at platform 1 on station s 2 , see Fig. 5 (both j 1 and j 2 are scheduled on this track) is expressed as and we have y(j 1 , j 2 , s 1out ) = y(j 1 , j 2 , s 2out ) (64) as the M-P is not possible on this route (note that this last condition will be lifted while rerouting).In either case, the QUBO (HOBO) representation would be: and for the decomposition we use: xt * ).The first part of the qubic penalty function is given by: and where h is the polynomial from Eq. ( 52).We use Eq. ( 8) for the objective.The ILP takes the following form: subject to: and the range of the integer variables t 1 , t 2 , t * 1 , t * 2 are determined by Eq. (56).We use Eq.(64) for the simplification of the precedence variables.
Suppose now that we find the value of the objective not satisfactory.In this case, we need to perform rerouting.In our case, the rerouting will concern changing the double-track line to the bidirectional traffic mode (many railway operators are being involved in such rerouting, e.g.Koleje Ślaskie, eng.Silesian Railways).In details, there is a conflict between the trains j 1 and j 2 on the line between s 1 and s 2 .Hence rerouting will be used to solve this conflict: We use the line between s 1 and s 2 as two parallel single-track lines (Track 1 for j 1 and Track 2 for j 2 ).In this case, we have no conflict between j 1 and j 2 and we lift the conditions in Eq. ( 59) and Eq. ( 64) (as M-P is now possible on the line), or remove the corresponding terms from the QUBO in Eq. ( 60).However, a new conflict arises between j 2 and j 3 on the single track resource (Line 2), so new conditions or terms will appear.Following Eq. ( 22) the single track line condition yields: as τ (pass) (j 2 , s 1 , s 2 ) = 8, and τ (pass) (j 3 , s 2 , s 1 ) = 8.Equivalently we can not have t 3 − 8 < t 2 < t 3 + 8) and we have the following QUBO penalty term: The objective would be as in Eq. ( 68), but subject to altered constraints: and the ranges of the integer variables t 1 , t 2 , t * 1 , t * 2 are determined by Eq. ( 56).
If j 3 goes first (z(j 2 , j 3 , s 1 , s 2 ) = 0), the additional delay of j 2 would exceed the maximal d max = 10.The optimal solution is z(j 2 , j 3 , s 1 , s 2 ) = 1 and 1 , j 2 , s 2out ) = 1, hence t 1 = 4, t 2 = 2, t 3 = 11, and t * 1 = 9.The additional delay of j 1 is 0, j 2 is 1, and j 3 is 3 with the objective 0.4, which is better than the objective of the default settings.As there is no possibility to reroute trains further to lift the conflict between j 2 and j 3 , we can consider this objective as the optimal one.

Numerical calculations
In this section, we present a proof-of-concept by solving the small numerical example described above using D-Wave solvers.
We first solved the problem using the ILP formulation to test the validity of the model.We used Python 3.9 programming language and PulP library [38] to implement the ILP formulation and CBC (Coin-or branch and cut) [39] solver to solve the problem, which is the default solver in PulP, to test the validity of the model.We reached t 1 = 4, t 2 = 6 and t 3 = 8 for the default settings (with objective 0.5), see Tab. 3a, and t 1 = 4, t 2 = 2 and t 3 = 11 for the rerouting (with objective 0.4) as expected, see Tab. 3b.Note that we are not interested in the run-time comparison between the linear solver and D-Wave, but we would like to demonstrate the potential of quantum annealing for solving train rescheduling problems.Table 3: Solutions obtained from the linear solver.
We implemented the QUBO formulation presented in Section 3.3 using D-Wave Ocean SDK.For the numerical calculations on the D-Wave machine we need to pick particular penalty values.The theory of penalty methods is discussed, for example, in [34].In general, the solution of the unconstrained objective tends to be a feasible optimal solution to the original problem as the penalties of constraints tend to infinity.However, in practice, these penalties have to be chosen so that the constraint terms do not dominate over the objective.If the penalties are too high, the objective can be lost in the noise of the physical quantum annealer.Based on these heuristics, we used the following strategy in the determination of penalties: The terms in Eq. (69) (the maximal penalty here is w j 1 = 2.) are "soft constraints", and the terms in Eq. (70) and Eq. ( 74) are the "hard constraints" that can not be broken for the solution to be feasible.Hence, we use the following penalty parameters p sum = 2.5, p pair = 1.25 (as each element is taken twice) and p qubic = 2.1.
Both for the default settings and rerouting, we had 176 logical variables, out of which 55 were the x variables and 121 were the auxiliary x variables.Here we have a relatively large overhead due to the cubic term.Hence the single track occupation condition has to be used with caution when handling large railway problems.To test the validity of the model, we first solved the two problems using the simulated annealer (SA) from the D-Wave Ocean SDK, which is a classical heuristic algorithm for solving combinatorial optimization problems stated as QUBOs.When running SA or QA, the output is a list of samples (0-1 assignments to the binary variables) and the corresponding energies (value of q(x)).The lowest energy solution is called the ground state.Using SA, We got the same solutions as the linear solver with the following energies q(x, x) = −12.0 and q r (x, x) = −12.1.The energies correspond to the ground state as −12.5 is the offset (the constant term in the QUBO formulation ), and 0.5 and 0.4 are the optimal (lowest possible) penalties for delays.
Next, we solved the problem on D-Wave Advantage quantum processing unit (QPU) [10].In D-Wave Advantage QPU, not all the qubits are interconnected via couplers, and the underlying graph has the specific structure known as the Pegasus topology [40].Hence, before running a problem on the D-Wave, a procedure called minor embedding is required to map the logical variables to the physical qubits on the machine.Due to limited connectivity, a single logical qubit is often represented with a chain of physical qubits that are coupled strong enough so that they end up in the same value representing the same variable.The coupling between the qubits in the chain is known as the chain strength, and a low chain strength may result in chain breaks while a high chain strength may override the problem parameters.In our experiments on D-Wave Advantage, we used the default minor embedding algorithm provided by Ocean SDK and used various chain strengths.The number of logical variables is 176 and the number of physical qubits used in the machine after embedding is ∼ 900.For both problems the degree of completeness of the problem graph was approximately 0.1.
Another parameter that needs to be set is the annealing time.Annealing time depends on the problem and problem size and is also limited by the current technology of D-Wave Advantage QPU.In our experiments, the annealing time is set as 250µs.Results of the D-Wave experiments are presented in Fig. 6.A solution is feasible, if it can be technically realized on the railroad infrastructure, i.e., all hard constraints are fulfilled.A solution is optimal if the order of the trains on conflicting resources (i.e., tracks that are used by more than one train) is the same as the order in the ground state solution.We reached optimal solutions (in the sense of the train order) using the D-Wave machine, both for the default settings and rerouting.
For the default settings, D-Wave results for chain strength 4 are: t 1 = 4, t 2 = 8 (adding an additional 0.7 to the objective), t 3 = 9 (adding an additional 0.1 to the objective) and t * 1 = 10, see Tab. 4a.The solution is feasible, since j 2 leaves s 1 at t 2 = 8, late enough to have no conflict with j 1 which will leaves s 1 at t 1 = 4. Furthermore, j 2 will arrive to s 2 at t 2 + τ (pass) (j 2 , s 1 , s 2 ) = 8 + 8 = 16, i.e. after j 1 leaves s 2 at t 1 = 11.The order of trains is the same as in the optimal solution, and the energy of the state is −11.7.This energy does not correspond to the ground state as there are some additional delays of the trains, however, do not affect the feasibility and the order of trains.
For rerouting, the results of D-Wave are: t 1 = 6, (adding an additional 0.4 to the objective) t 2 = 4 (adding an additional 0.3 to the objective), t 3 = 13 (adding an additional 0.5 to the objective) and t * 1 = 11, see Tab. 4b.The solution is feasible, since j 2 will arrive to s 2 at t 2 + τ (pass) (j 2 , s 1 , s 2 ) = 4 + 8 = 12, i.e. after j 1 leaves s 2 at t * 1 = 11, and before j 3 leaves s 2 at  In the case of D-Wave Advantage, only one feasible solution was found for each panel.For chain strength 4.0 at each panel, the percentage of feasible solutions over total number of solutions is roughly 2.5 × 10 −4 .t 3 = 13.The order of trains is the same as in the optimal solution, the state energy is −11.3.Like in the default settings case, the found solution is feasible but not the ground state.Another alternative is to use the hybrid solver for binary quadratic models provided by D-Wave.The hybrid solver runs in parallel modules consisting of a heuristic classical component to explore the search space and a quantum component that makes queries to D-Wave QPU to guide the optimization process and improve the existing solutions.The best solution found among the parallel runs is returned to the user [41].Using the hybrid solver, we obtained the ground state, both in the case of default settings and the rerouted setting.
With our example, we have demonstrated that although it is possible to have the optimal solution for the D-Wave, it is not straightforward and requires at least an extensive parameter sweep.On the other hand, the D-Wave hybrid solver found the ground state on the first try.More importantly, the hybrid solver can be used for tackling larger problems as those solvers can work on problem instances with up to 20000 variables that are fully connected or up to 1

Assesment of solvers on larger instances
To demonstrate the feasibility of the hybrid solver, we have assessed both the D-Wave Advantage and the D-Wave hybrid solver on a bit larger examples.In both examples, we use the same parameters settings as in Sec.4.1 for the calculations.
The first is a bit enlarged default setting one with infrastructure as in Fig. 5. Here, in addition, train j 3 is followed by another stopping train j 4 , and the conflict occurs on the minimal headway between j 3 and j 4 .We call the problem the 4 trains 2 stations example.The problem is a bit larger with 187 logical variables.Although the number of connections is larger, the degree of completeness of the graph is a bit smaller and equals roughly 0.09.The ground state energy, consistent with the solution of the ILP, equals to q(x, x) = −14.4.
The second example concerns a larger number of trains and a larger number of stations on a more branched network.We call the problem the 5 trains 5 stations example.The problem is encoded on 341 logical variables, but with a much smaller degree of completeness of the graph which equals roughly 0.04.The ground state energy, consistent with the solution of the ILP, equals to q(x, x) = −21.49.
Results of calculations for both additional examples are presented in Fig. 7.As we can see for slightly larger problems than in Sec.4.1, the D-Wave Advantage does not give any feasible solution.The D-Wave hybrid solver, on the other hand, still has promising outcomes.Actual characteristics of the problem are presented in Tab. 5. Here, we have observed that the larger the railway problem is, the smaller the degree of completeness.This observation coincides with Tab.IV [16] and discussion in Sec. 3 as the number of variables and number of non-zero QUBO terms are roughly linear in the number of trains and stations.Referring to Fig. 6, Fig. 7, and Tab.IV in [16] we can generally conclude that smaller railway problems, with the graph's degree of completeness of 0.1 or larger, are solvable on the D-Wave machine without the need for the D-Wave hybrid solver.For larger problems, the hybrid solver is necessary.From a practical point of view, the above-presented problems are still of small size due to the small size of the current D-Wave machine.To estimate the amount of logical resources needed to solve real-life problems, let us consider an hour cycle on the dense traffic (one train per 2 min.in each direction) on the double-track metro line with 20 stations.(In an hour cycle, we have 60 trains.)We then consider d max = 5 minutes, and 1 minute resolution.According to Eq. ( 49), we would have roughly 7 200 variables.If each train is assumed to be in potential conflict with H(d max ) = 5 other trains (that many trains pass in 2d max = 10 min.interval), then according to Eq. (55) we will have roughly 216 000 auxiliary variables.(Obviously, in both cases, the particular number of variables depends on the details of the topology of the problem.)Such a problem would be solvable on the not very large device but with possible 3rd order connections or a much larger one with 2nd order connections only.

Conclusions and outlook
As current classical technologies are insufficient to compute feasible results in a reasonable time, fully developed quantum annealing offers potential polynomial speed-ups for these problems.However, switching from the conventional notation to the one demanded by the quantum annealer is a challenge.Our paper is the first to present the quadratic and higher-order unconstrained binary optimization representation for the railway rescheduling problem concerning the determination of the order of trains on conflicted resources, rescheduling, and rerouting trains on single-, double-and multi-track lines and stations.
The number of qubits is one of the bottlenecks of current quantum devices.It is thus desirable to use the smallest possible number of qubits when modeling.When quadratic and higher-order models are compared, the latter is more efficient in terms of the number of qubits required.Although currently, it is not possible to utilize HOBO with quantum annealers, the need for quantum annealers allowing such interactions is evident [43].There is also ongoing work for building architectures that allow solving optimization problems involving higher-order terms directly [44] in the gate-based model.Furthermore, algorithms like quantum approximate optimization algorithm (QAOA) [26] allow solving higher-order problems natively [13,14].
Four demonstrative problems were implemented on the current D-Wave machine.Two smaller problems were successfully solved both using the D-Wave Advantage QPU and using the D-Wave hybrid solver.Two larger problems were successfully solved only on D-Wave hybrid solver which we find promising for solving larger instances.Importantly, we have presented the HOBO/QUBO formulation that can be used with quantum-inspired architectures designed for solving combinatorial optimization problems stated in QUBO form such as Fujitsu digital annealers [45].
Determination of penalty values poses a challenge for solving QUBO problems in general.Although we have determined the penalty values using heuristic methods, note that there are some recent algorithms dedicated to penalty determination like the cross entropy optimization discussed in [46] and the one discussed in [47] (see.Eg.Section 3.2) is tested successfully on the particular Fujitsu digital annealer.
Curiosity arises on how quantum annealers or other Ising-based heuristics behave in solving real-life problems compared to conventional methods.Further research should be undertaken to explore the applicability of the presented approach for real-life train rerouting and rescheduling problems.In particular, when considering the railway traffic on the regional scale where delays can be large and the number of trains is not very large, the QUBO formulation that will be obtained from the ILP representation presented in this paper may be worth investigating.
Besides wide railway potential applications (ordinary railways, metro, trams), discussed rules of problem conversion into HOBO / QUBO can be applied generically in many branches of operational research.Let us list a few: 1. Electric bus scheduling, where the charging place occupation condition can be modeled in analogy to our track occupation condition.
2. Automated guided vehicle (AGV) scheduling in the factory, where there are many railway analogies.AGVs have a pre-designed schedule that is conflicted and needs to be rescheduled.AGVs follow the paths that are uni or bi-directional; hence, there is a headway and single track line condition.There are places that can be occupied by one AGV at a time (track occupation condition), paths of AGVs cross (switch condition), and there is the sequence of tasks for the given trolley (rolling stock circulation condition).Rerouting of AGVs can be treated as an extra task beyond the optimization as in Fig. 4., and finally, AGVs may have various priorities.
In general, our HOBO approach (generated by track occupation condition) may be applicable for models consisting of "stations" that can be occupied by only one "vehicle" at a time, with waiting possibilities on stations and no-waiting elsewhere.

S
Annealing time = 250 µs, 3996 reads D-Wave advantage non-feasible D-Wave advantage feasible Ground energy and D-Wave hybrid solver

Figure 6 :
Figure 6: Lowest energy solutions obtained from D-Wave Advantage and D-Wave hybrid solvers.In the case of D-Wave Advantage, only one feasible solution was found for each panel.For chain strength 4.0 at each panel, the percentage of feasible solutions over total number of solutions is roughly 2.5 × 10 −4 .

Figure 7 :
Figure 7: Lowest energy solutions obtained from D-Wave Advantage and D-Wave hybrid solvers for larger problems.We present also comparison with the ground state achieved from the ILP (classical) approach.

Table 2 :
List of the variables used in the paper.
min.j∈Jw j s∈S j t(j, s out ) − υ(j, s out ) d max (36) for each condition there are roughly |J ||S|(d max + 1)N (d max ) non-zero QUBO or HOBO terms.The exception is the rolling stock occupation condition, that occurs rather rarely in comparison with others.We expect roughly 3|J ||S|(d max + 1)N (d max ) non-zero QUBO terms, and |J ||S|(d max + 1)N (d max ) HOBO terms.In comparison with LIP(36), this still can be an efficient implantation provided d max is controlled.

Table 4 :
Solutions obtained from the D-Wave.

Table 5 :
Problem characteristics.The first 3 rows refer to the characteristics of the QUBO and the last one to the characteristics of the solver.Number of physical variables is computed via D-Wave's default embedding algorithm minorminer which is a heuristic algorithm, hence the data are approximate.