1 Introduction

Operations research plays a considerable role in solving real-world problems in public transportation, with specific problem settings becoming more and more integrated. A major breakthrough regarding visibility outside of the public transportation domain was the success of these approaches at the Edelman award more than 15 years ago (see Abbink et al. 2005). Ever since, more and more details have made problem settings richer and richer. While the notion of rich problems is widely known in the vehicle-routing domain (cf. Hartl et al. 2006), we can also envisage them in public transport. Starting from the classical vehicle and crew scheduling problems, various extensions have been tackled in the literature, such as legal constraints, fairness constraints and company-related governance rules. Examples that caught our interest over time include rostering and days-off patterns in Mesquita et al. (2013) and the consideration of fixed buffer times and delay propagation in Amberg et al. (2019).

Over the years, we also saw major achievements of solution methodology in solving hard optimization problems in operations research and, recently, we can find a wealth of general-purpose solvers (see footnote 1) ready to solve problems once they have been “mipped”. That is, following Fischetti et al. (2009), it pays to formulate a problem by means of a mathematical model like a mixed-integer programming (MIP) model and then solve it using related software (including those general solvers as well as especially tailored matheuristics and metaheuristics). Moreover, even classical methods like Benders decomposition can now be enhanced in a matheuristic fashion and be improved considerably (see, e.g., Caserta and Voß 2021).

In the light of all these improvements, we are asking the following research questions:

  • R1: Is it possible to solve integrated problems in public transport, that had to be tackled by means of specialized heuristics years ago due to their inherent problem complexity, by means of currently available standard solvers and, if so, which instance sizes are to be solved in time limits deemed practical?

  • R2: Is it possible to integrate additional problem features in given vehicle and crew scheduling problems to make them richer, without fully discarding the solvability by means of standard solvers?

To answer these research questions, we go back to earlier solution approaches in integrated vehicle and crew scheduling problems. After briefly surveying this problem domain in Sect. 2, we single out two approaches from the related literature in Sect. 3, namely the above-mentioned references of Mesquita et al. (2013) and Amberg et al. (2019). After investigating the settings of Mesquita et al. (2013) first, we investigate in Sects. 3.2 and 3.3 problem extensions that make those already rich problems a little richer, and we ask the same research questions again. The final section concludes and offers ideas for future research.

2 Literature review

Planning problems in public transport are usually classified into different types of problems depending on which impact they have, i.e., strategic, tactical and operational. We assume that the reader is acquainted with this in general terms (see, e.g., Ceder 2015; Daduna and Voß 2000; Desaulniers and Hickman 2007; Vuchic 2005).

First, we consider issues in public transport with a focus on vehicle and crew scheduling. Then we turn towards more general issues in mixed-integer programming.

2.1 Integrated vehicle and crew scheduling

According to Desaulniers and Hickman (2007, p. 104/105), the integrated vehicle and duty scheduling problem (IVDSP) is defined as follows. “Given a set of timetabled trips and a fleet of vehicles assigned to several depots, find minimum-cost vehicle blocks and valid driver duties such that each active trip is covered by one block, each active trip segment is covered by one duty, and each deadhead, pull-in, and pull-out trip (hereafter called an inactive trip) used in the vehicle schedule is also covered by one duty”. Assumptions usually include that blocks need to start and end at the same depot, several work rules have to be considered, drivers are originating also from one depot, and possibly many additional ones. Figure 1 aims to clarify this situation in an extended setting for a single depot and four trips \(t_1, t_2, t_3, t_4\). Assigning duties to drivers is usually called rostering.

Fig. 1 Time-space network example with four service trips from Amberg et al. (2019, p. 95)

Vehicle and crew scheduling in public transport are usually solved as tactical and operational problems, often in a hierarchical fashion, mostly due to the inherent difficulty of the problems (see, e.g., Freling et al. 1999; Desaulniers and Hickman 2007; Perumal et al. 2020 for surveys over time and the embedding of this area into operations research and related planning and operations).

Usually, vehicle and crew scheduling have been solved in a hierarchical or an integrated way. The hierarchical way can also be called sequential, where vehicle schedules are specified first before crew schedules are determined. That is, for existing lines, trips are defined for which vehicles are scheduled and then crews are assigned to duties covering these trips. An additional common distinction accounts for distinguishing the single-depot and the multi-depot case. Early references for integrated problem settings are some PhD-theses on the topic (see, e.g., Freling 1997) and related papers (see, e.g., Freling et al. 2001; Haase et al. 2001; Huisman et al. 2005).

From the recent survey of Perumal et al. (2020) we can make an interesting observation which is that in the 10-year period from 2009/2010 there are not so many works any more on the integrated vehicle and crew scheduling problem. Exceptions are to some extent those that try to emphasize practical settings as, e.g., Mesquita et al. (2013); Er-Rbib et al. (2021b) for rostering (both with days-off patterns). Only recently, the interest in these integrated problems became rejuvenated with the focus of making the problem settings richer. In this respect, a few issues of interest within public transport refer to “complications” of the problem settings. One of these issues relates to the inclusion of buffer times. This can be done upfront by simple inclusion of extra time, often known as schedule padding (see, e.g., Wessel and Widener 2017), or by means of calculations with fixed or variable buffer times (see, e.g., Swierstra et al. 2017). Examples for the latter include delay propagation by Amberg et al. (2019). New ideas and attempts have been motivated especially by new technologies like electric buses (cf. Perumal et al. 2020) or by means of incorporating issues regarding disturbances and robustness (cf. Borndörfer et al. 2010; Ge et al. 2020). Fairness settings regarding workload balancing are considered (e.g., by Xie and Suhl 2015; Er-Rbib et al. 2021a).

While integrated vehicle and crew scheduling is a topic of great interest in public transport, it can also be found in related areas. Starting from the vehicle routing problem and opening up also to logistics, we can see an explosion of various integrated settings. In logistics the integration may include the joint movement of various transport modes (like a vehicle with a driver and an unmanned drone); see, e.g., Otto et al. (2018) for a survey on this type of problem. Starting with the classical vehicle routing problem, Lam et al. (2020) relax the usual assumption that only one crew operates a vehicle in that problem setting. With a focus on humanitarian and military logistics, the authors formulate the problem in mathematical terms, combining MIP approaches with constraint programming. In the same spirit, we can find crew rostering problems with added richness in the airline industry (see, e.g., Doi et al. 2018). Work on integrated vehicle and crew scheduling in the maritime industry (where vehicles are assumed to be vessels) is rather scarce. For an exception see Ang Pik Yoke et al. (2021). In all these cases data availability seems a major issue; for a recent survey regarding public transport see Ge et al. (2021).

Mathematical modeling attempts for integrated vehicle and crew scheduling mostly utilize either a network-flow formulation or the set partitioning or set covering problem as an underlying problem structure. Most early approaches for the exact solution of integrated problem settings were based on these settings by applying branch-and-bound or branch-and-cut approaches using multi-commodity flow or set partitioning- and set covering-based modeling approaches (Desaulniers and Hickman 2007; Perumal et al. 2020). Advances in this respect are still observed over time; see, e.g., Tahir et al. (2019), who investigate a set partitioning-based approach for vehicle scheduling. Using dynamic programming and an adjacency-based algorithm, Himmich et al. (2020) recently proposed a primal column generation framework. Moreover, various types of heuristics and metaheuristics have been tried, including even especially tailored Benders-based heuristics (like the one of Mesquita et al. 2013).

As an interlude we should state that typical problem sizes in literature are largely varying. Shen and Xia (2009) solve data from the Beijing Bus group using an iterative sequential heuristic algorithm and they can find feasible solutions for instances up to 107 buses and 164 duties. Mesquita et al. (2011) use data from a bus company in Lisbon to demonstrate their preemptive goal programming-based heuristic approach while they solve the integrated problem two years later (Mesquita et al. 2013) with a Benders decomposition approach. All three papers evaluate their proposed algorithms on real-world instances, but the timetabled trips are limited to 238. To be more specific, we like to re-iterate the details of the instance sizes going up to 238 trips. The first set of data comes from different bus lines from Lisbon, Portugal, with 122, 168, 224, 226 and 238 timetabled trips for weekdays and vehicles from four depots. The second set comes from Porto, Portugal, with 108, 156, 250 and 280 timetabled trips for weekdays and vehicles from one depot. The results reported in Mesquita et al. (2013) also incorporate those for selected data from the repository of Huisman (2007) (the repository includes instances with up to 400 trips). The data used incorporate start and end times as well as start and end locations of all trips for instances with 80 and 200 trips, respectively, involving four depots (for weekdays). Weekend trips were randomly selected, with a probability of 0.4 from the respective non-weekend daily trip sets. Below, we shall show that we can solve even larger instances to optimality.
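The weekend sampling rule mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_weekend_trips`, the seed handling and the trip identifiers are hypothetical, and we assume that each weekday trip is kept independently with probability 0.4, which the text does not spell out.

```python
import random


def sample_weekend_trips(weekday_trips, keep_prob=0.4, seed=0):
    """Keep each weekday trip independently with probability keep_prob.

    Independence per trip is an assumption; the source only states that
    weekend trips are randomly selected with a probability of 0.4.
    """
    rng = random.Random(seed)  # local generator, so runs are reproducible
    return [t for t in weekday_trips if rng.random() < keep_prob]


# Hypothetical instance with 200 weekday trips identified by 1..200.
weekday = list(range(1, 201))
weekend = sample_weekend_trips(weekday)
print(len(weekend), "of", len(weekday), "trips kept for the weekend")
```

Fixing the seed makes the generated weekend set repeatable, which matters when instances are to be shared or replicated.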

Our approach in this paper is to revisit earlier formulations from literature. That is, we repeat a given formulation, focus on its solvability under today’s standard solver conditions, and then show that several as yet unexplored extensions to that formulation are within the same range regarding solvability.

Looking at possible extensions from the literature, we like to point to very recent ones that are difficult to integrate even with today’s techniques. As an example, consider the work of Er-Rbib et al. (2021b). For a given set of predefined duties and groups of drivers, they define the duty assignment problem with group-based driver preferences as the problem of building rosters that cover all the duties over a predetermined cyclic horizon while respecting a set of rules (hard constraints), balancing the workload between the drivers and satisfying the driver preferences as much as possible (soft constraints). In a sense this may be referred to as a problem setting seeking fairness among the drivers. The problem also considers fixed days off (which relates it to the settings of Mesquita et al. 2013). Their mathematical programming formulation makes it possible to solve instances with up to 124 drivers and 490 duties to optimality while keeping a near-optimal workload balance. Larger instances with up to 333 drivers and 1509 duties are treated with a newly developed matheuristic, though without guaranteeing optimality for the related results. A different set of constraints to be integrated stems from Perumal et al. (2021). Given a set of timetabled bus trips, they search for a driver schedule covering these trips while obeying various labor union regulations (especially referring to breaks for the drivers). In addition, drivers may need to travel by separate vehicles (called staff cars) between bus stops to have their breaks. The simultaneous scheduling of drivers and staff cars is considered as the driver scheduling problem with staff cars. The authors consider several instances of different sizes from a number of European service providers (with a nondisclosure agreement so that the detailed data are not available to readers). The number of trips varies from 43 to 1926 and the number of staff cars ranges from 0 to 6, but it can be as large as 15 for one of the instances. While the small instances can be solved to optimality using the MIP model, most larger ones are out of reach for the model and a matheuristic is used (combining the mathematical programming formulation with an adaptive large neighbourhood search).

2.2 Mixed-integer programming and replication studies

Many of the advances in mathematical programming utilized in general, i.e., not only in integrated vehicle and crew scheduling, come from the advances of information and communication technologies as well as general purpose solvers like CPLEX and Gurobi. This is also in line with the advances in general MIP solving. An entry point into these advances may be given using Gleixner et al. (2021) and Mittelmann (2020). A comparative analysis of different solvers can be found, e.g., in Anand et al. (2017) and within the links provided in Mittelmann (2020).

Next, we like to mention an area that has not yet comprehensively been combined with public transport research (if at all), namely replication studies. First of all, for many mathematical programming formulations in combinatorial optimization, like for the traveling salesman problem, it is well established to re-use existing formulations to get new insights on the specific problem under consideration. Within the social sciences, e.g., Hüffmeier et al. (2016) argue that replications may have a lot to offer beyond what is already known, especially if an improved conceptualization is available. This may be appended if new or improved technology or algorithmic advances have appeared over time. Related arguments can be found in various areas including management (see, e.g., Block and Kuckertz 2018; Boylan 2016). Putting this into perspective, we can argue that modern information and communication technology, standard solver technology as well as the upcoming use of matheuristics has made the case also for replication studies, e.g. using existing mathematical programming formulations in public transport. Even in the field of software engineering and metaheuristics this is mostly unexplored (see, e.g., Kendall et al. 2016; Swan et al. 2022). A starting point within software engineering could even be the idea of back-to-back testing (see, e.g., Jörges and Steffen 2014).

To the best of our knowledge, a general framework for replication studies in mathematical programming does not yet exist. There are a few thoughts that may be used to provide some pointers to related literature. Coming from software engineering and algorithm design as well as heuristics or metaheuristics, re-use needs to be preplanned (see, e.g., Fink and Voß 2002 with a focus on metaheuristics class libraries). In a sense, this applies to solvers as well as methods themselves. Over time, many algorithmic approaches have been published without always being rigorous or standing the test of novelty (see, e.g., Sörensen et al. 2019 for a specific example and de Armas et al. 2021 for the proposal of a template in the area of metaheuristics). Differently, this topic relates to replicating studies performed on developed models in mathematical programming, discrete event simulation, etc. (see, e.g., the discussion in Taylor et al. 2018). It seems still seldom that a repository is set up with the underlying data and models as well as algorithms being provided for readers. And if it is (like in case of a heuristic optimization framework from the above-mentioned Fink and Voß 2002), then one often finds counter-arguments regarding its use related to a “not-invented-here syndrome” (see, e.g., Nissen 2018).

As this has not yet been extensively studied, we just provide some hint on where this could go, e.g., in the field of metaheuristics and evolutionary programming like López-Ibáñez et al. (2021), Swan et al. (2022) and de Armas et al. (2021). Within mathematical programming the arguments used when setting up the MIPLIB library, including quite a few instances from our scheduling domain, should be used as a starting point (see Koch et al. 2011; Gleixner et al. 2021) even if some countermeasures exist (see Mittelmann 2020). A clear-cut standpoint might be to provide at least information about the computing environment, if separate programming is needed, the programming language, information about the used software or solver, the model itself, the used data, etc.

Supporting arguments can be found in the works around erraticism, i.e., replicating existing studies with different seeds for random number generators as proposed by Fischetti and Monaci (2014). The idea is to replicate numerical results with the same model and just different random seeds. Extending this, Voß and Lalla-Ruiz (2016) reformulate the multiple-choice multidimensional knapsack problem as a generalized set partitioning problem and thereby obtain various new best known results for one of the most studied problems in combinatorial optimization. And, just to support the case, they even show that the heuristic cut generation within CPLEX and Gurobi may lead to different new best results. Extending this may lead to the idea of utilizing redundant constraints to strategically influence the success of the solver, as shown in Lalla-Ruiz and Voß (2016). On a similar scale, one may also consider the same type of model and the same type of algorithm with just some slight modification. As an example with an application towards integrated vehicle and crew scheduling we refer to a modified branching strategy within branch-and-bound proposed in Borndörfer et al. (2013), named rapid branching.

3 Selected problem setting(s) and solvability

As we have seen in the previous section, there are quite a few works emphasizing integrated vehicle and crew scheduling. As an example incorporating a certain type of richness, we resort to the work of Mesquita et al. (2013). Regarding our research questions it seems most suitable due to its age and the related innovation of that time (applying an advanced Benders heuristic based on an appropriate mathematical modeling approach). That is, the time that has passed between their exposition and now fits our research questions very well. Moreover, in the spirit of considering (“revisiting”) a certain type of richness, this work incorporates some legal constraints as well as days-off patterns. In addition, buffer times as well as robustness issues are treated.

3.1 The model of Mesquita et al.

Let us consider the integrated vehicle and crew scheduling or rostering problem (i.e. minimum cost vehicle and daily crew schedules that cover all timetabled trips and a minimum cost roster covering all daily crew duties according to a pre-defined days-off pattern) as provided in Mesquita et al. (2013), and let us start with some notation following that reference. Firstly, we use \(N^{h}\) to indicate the set of trips to be performed on a specific day h with a planning horizon of H. Secondly, binary variables \(z^{dh}_{ij}\) indicate whether a vehicle from a specific depot d (out of a given set of depots \({\mathcal {D}}\)) performs trip i immediately before trip j. Moreover, \(I^{h}\) indicates the set of pairs of compatible trips (ij), i.e., any two trips i and j which can be performed immediately one after another. These are regular trips as well as those coming from a depot (called pull-out trips; see also Fig. 1) or going into a depot (called pull-in trips). The cost values to be considered for trips i and j to be performed in immediate succession are \(c^{d}_{ij}\). Note that Mesquita et al. (2013) do not distinguish these values for specific days (like weekdays versus weekends or holidays). Moreover, pull-in trips are implicitly considered due to the nature of the used flow conservation constraints (see (3) in the model below). Additional data include \(v_{d}\) as the number of vehicles available at depot \(d \in {\mathcal {D}}\). Moreover, we define a set of pull-in trips as \({\mathcal {D}}'\).

\(L_{ij}^{h}\) denotes the set of crew duties possibly covering task (ij) on day h, where this is a subset of the set \(L^{h}\) of all crew duties of a specific day h. Each possible crew duty l incurs a cost value of \(e_{l}\). The related variables \(w_{l}^{h}\) indicate whether crew duty l is selected on day h. Each driver m belongs to a set of drivers M. With that, binary variables \(y_{l}^{mh}\) may be defined indicating whether driver m performs a certain crew duty l on day h. To perform a duty, a driver m must be available and scheduled according to a certain schedule \(s \in S\) which defines his or her availability during certain days, hours, etc. This may even incorporate certain duties that are differentiated from regular duties, i.e., short duties of at most 5 h without break and long duties with overtime beyond 9 h. That is, S indicates the set of possible schedules or schedule variations (like a regular day shift on a weekday, on a weekend etc.). Binary variables \(x_{s}^{m}\) together with cost values \(r^{m}\) are used to represent this. In Mesquita et al. (2013) this is also emphasized by means of a binary parameter \(a_{s}^{h}\) stating whether a certain day h is included in schedule (or schedule type) s. Variables \(\eta _{T}\) and \(\eta _{O}\) are used to account for the used numbers of short and long duties, respectively, incurring penalties \(\lambda _{T}\) and \(\lambda _{O}\) if they are used. Short duties are those up to 5 h and long duties are those of more than 9 h (see footnote 2); all others are called normal duties. Note that these numbers are bounded from above by means of the data of each specific problem instance. Set \(L^{h}\) may have related subsets defined, \(L^{h}_{T}\) for short duties, \(L^{h}_{O}\) for long duties, and \(L^{h}_{N}\) for those normal duties that are neither short nor long. Without having to pay extra costs, one may also distinguish different duties starting in the first or the latter part of a day, say, a set \(L^{h}_{E}\) indicating those which start any time in the morning up to 3:30 pm (they, e.g., need to incorporate a lunch break for drivers) and a set \(L^{h}_{A}\) starting at 3:30 pm or later.
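To illustrate how a single duty ends up in the subsets \(L^{h}_{T}\), \(L^{h}_{N}\), \(L^{h}_{O}\) and \(L^{h}_{E}\), \(L^{h}_{A}\), the following Python sketch classifies a duty by its length and start time. It is a minimal sketch under assumptions: the function name and the minutes-after-midnight encoding are hypothetical, a start exactly at 3:30 pm is counted as late, and the break condition for short duties is ignored.

```python
def classify_duty(start_min, end_min):
    """Return (length_class, start_class) for one duty.

    Times are minutes after midnight (an assumed encoding).
    length_class: "T" short (at most 5 h), "O" long (more than 9 h), "N" normal.
    start_class:  "E" early (start before 3:30 pm), "A" late (3:30 pm or later).
    """
    length_h = (end_min - start_min) / 60.0
    if length_h <= 5:
        length_class = "T"
    elif length_h > 9:
        length_class = "O"
    else:
        length_class = "N"
    late_threshold = 15 * 60 + 30  # 3:30 pm in minutes after midnight
    start_class = "E" if start_min < late_threshold else "A"
    return length_class, start_class


# A duty from 6:00 to 11:00 is short and early; one from 15:30 to 01:00
# on the next day (9.5 h) is long and late.
print(classify_duty(6 * 60, 11 * 60))
print(classify_duty(15 * 60 + 30, 25 * 60))
```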

To ease with the notation, we repeat the complete notation as follows. Note that we already include some notation that is going to be used later in the attempt to provide a formulation incorporating additional means of richness (regarding delay propagation).

Parameters

\(c^d_{ij}\)

The cost of the deadhead trip from i to j

\(e_l\)

The duty cost of l

\(L^h, L^h_E, L^h_A, L^h_T, L^h_N, L^h_O\)

The set of crew duties, the set of early duties, the set of late duties, the set of short duties, the set of normal duties, the set of long duties

\(L^h_{ij}\subset L^h\)

The set of crew duties covering task (i,j) on day h

M

The set of drivers

\({\mathcal {D}}\)

The set of depots

\(N^h\)

The set of trips to be performed on day h

\(v_d\)

The number of vehicles available at d

\(a^h_s\)

=1 if h is a workday on schedule s, 0 otherwise

\(r^m\)

The assignment cost of driver m to a schedule

\(\sigma ^d_k\)

The delay associated with duty k if it is done with the vehicles of depot d

\(\lambda _T\)

The penalty for short driver duties

\(\lambda _O\)

The penalty for long driver duties

\(\alpha _V\)

The coefficient for the effect of delay propagation in terms of vehicle tasks

\(\alpha _C\)

The coefficient for the effect of delay propagation in terms of crew duties

TK

The set of all duties in the chronological order of their execution time, \(TK=\{1,2,\ldots ,|TK|\}\)

\(K^C_{ij}\)

The set of crew duties that can serve (ij), \(K^C_{ij} \subset TK \)

\(K^V_{ij}\)

The set of vehicle duties that cover the trip associated with (ij), \(K^V_{ij} \subset TK \)

\(ED_k\)

Expected delay of duty k

\(BT_k\)

The buffer time before the execution time of k

Variables

\(z^{dh}_{ij}\)

=1 if a vehicle from depot d performs trips i and j in sequence on day h

\(z^{dh}_{dj}, z^{dh}_{id}\)

The pull-out from d to trip j and the pull-in from i to depot d

\(w^h_l\)

=1 if crew duty l is selected on day h, 0 otherwise

\(x^m_s\)

=1 if driver m is assigned to schedule s, 0 otherwise

\(y^{mh}_l\)

=1 if driver m performs crew duty l on day h, 0 otherwise

\(\eta _T, \eta _O\)

The maximum number of short, the maximum number of long duties assigned to a driver during H

\(DP_k\)

Delay propagation up to duty k

\(ATD^d_k\)

The actual total delay of duty k regarding the vehicles of depot d

\(p^d_k\)

Binary variable=1, if crew duty k is selected (using depot d)

\(q^d_k\)

Binary variable=1, if vehicle duty k is selected (using depot d)

Now, we can model as follows:

$$\begin{aligned}&\min&\sum _{h \in H} \left\{ \sum _{d \in {\mathcal {D}}} \sum _{(i,j) \in I^{h}} c^{d}_{ij} z^{dh}_{ij} + \sum _{l \in L^{h}} e_{l} w_{l}^{h}\right\} + \sum _{m \in M} \sum _{s \in S} r^{m} x^{m}_{s} + \lambda _{T} \eta _{T} + \lambda _{O} \eta _{O} \nonumber \\&\end{aligned}$$
(1)
$$\begin{aligned}&\text{ s.t. }&\sum _{d \in {\mathcal {D}}} \sum _{i: (i,j) \in I^{h}} z^{dh}_{ij} = 1 \quad \quad \forall j \in N^{h} - {\mathcal {D}}, \forall h \in H \end{aligned}$$
(2)
$$\begin{aligned}&\sum _{i: (i,j) \in I^{h}} z^{dh}_{ij} - \sum _{i: (j,i) \in I^{h}} z^{dh}_{ji} = 0 \quad \quad \forall j \in N^{h}, \forall d \in {\mathcal {D}}, \forall h \in H \end{aligned}$$
(3)
$$\begin{aligned}&\sum _{i \in N^{h}} z^{dh}_{di} \le v_{d} \quad \quad \forall d \in {\mathcal {D}}, \forall h \in H \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{l \in L_{ij}^{h}} w_{l}^{h} - \sum _{d \in {\mathcal {D}}} z^{dh}_{ij} \ge 0 \quad \quad \forall (i,j) \in I^{h}, \forall h \in H \end{aligned}$$
(5)
$$\begin{aligned}&\sum _{m \in M} y_{l}^{mh} - w^{h}_{l} = 0 \quad \quad \forall l \in L^{h}, \forall h \in H \end{aligned}$$
(6)
$$\begin{aligned}&\sum _{s \in S} x_{s}^{m} \le 1 \quad \quad \forall m \in M \end{aligned}$$
(7)
$$\begin{aligned}&\sum _{l \in L^{h}} y_{l}^{mh} - \sum _{s \in S} a_{s}^{h} x_{s}^{m} \le 0 \quad \quad \forall m \in M, \forall h \in H \end{aligned}$$
(8)
$$\begin{aligned}&\sum _{l\in L^{h}_{f}} y_{l}^{mh} + \sum _{l\in L^{h-1}_{g}} y_{l}^{m(h-1)} \le 1 \quad \quad \forall m \in M, \forall h \in H - \lbrace 1 \rbrace , f \ne g \in \lbrace E,A \rbrace \nonumber \\&\end{aligned}$$
(9)
$$\begin{aligned}&\sum _{h \in H} \sum _{l\in L^{h}_{t}} y_{l}^{mh} - \eta _{t} \le 0 \quad \quad \forall m \in M, t \in \lbrace T,O \rbrace \end{aligned}$$
(10)
$$\begin{aligned}&z_{ij}^{dh} \in \lbrace 0, 1 \rbrace \quad \quad \forall (i,j) \in I^{h}, \forall d \in {\mathcal {D}}, \forall h \in H \end{aligned}$$
(11)
$$\begin{aligned}&w_{l}^{h} \in \lbrace 0, 1 \rbrace \quad \quad \forall l \in L^{h}, \forall h \in H \end{aligned}$$
(12)
$$\begin{aligned}&y_{l}^{mh} \in \lbrace 0, 1 \rbrace \quad \quad \forall l \in L^{h}, \forall m \in M, \forall h \in H \end{aligned}$$
(13)
$$\begin{aligned}&x_{s}^{m} \in \lbrace 0, 1 \rbrace \quad \quad \forall s \in S, \forall m \in M \end{aligned}$$
(14)
$$\begin{aligned}&\eta _{T}, \eta _{O} \in {\mathbb {N}}_{0} \end{aligned}$$
(15)

The objective function (1) measures the quality of the solution in terms of vehicle and driver costs and roster balancing (i.e., the third term in the objective is the total assignment cost of drivers to schedules), as well as penalties for short and long duties. Constraints (2) and (3) describe a vehicle scheduling problem: Equalities (2) state that each timetabled trip is performed exactly once, i.e., it is reached by exactly one vehicle of exactly one depot coming from an immediately preceding trip or pull-out, and Equalities (3) are the flow conservation constraints ensuring that a vehicle of depot d entering a trip also leaves it. Each depot has only a limited number of vehicles available, as indicated in (4). Constraints (5) guarantee that each task in a vehicle schedule is covered by at least one crew duty, i.e., here we have the coupling of vehicle and crew duty variables. Equalities (6) ensure that each crew duty of a solution to the problem must be assigned to a driver. Constraints (7) guarantee that each driver is assigned to at most one schedule or service. The model also incorporates a few important coupling constraints. Constraints (8) link the assignment of a crew duty to the specific schedule (type) assigned to a driver. Constraints (9) are intended to forbid that an early duty follows immediately after a late duty; as written for both orderings of f and g, they also rule out a late duty on the day after an early one. Constraints (10) link the numbers of short and long duties assigned to each driver over the planning horizon to \(\eta _{T}\) and \(\eta _{O}\), which are penalized in the objective, and (11) to (15) define the variable domains.
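To make the vehicle scheduling part concrete, the following Python sketch checks a candidate vehicle schedule for a single day against constraints (2) to (4). It is not the GAMS model used in this paper; the function name, the data layout (a set of used arcs `(d, i, j)`, where `i == d` encodes a pull-out and `j == d` a pull-in) and the toy instance are assumptions made for illustration only.

```python
from collections import defaultdict


def check_vehicle_schedule(trips, depots, vehicles, used_arcs):
    """Check constraints (2)-(4) of the model for one day.

    used_arcs: set of triples (d, i, j) with z[d, i, j] = 1, meaning a
    vehicle of depot d serves j right after i (assumed encoding).
    Returns a list of violated constraints (empty if all hold).
    """
    inflow = defaultdict(int)   # (d, node) -> number of used arcs entering node
    outflow = defaultdict(int)  # (d, node) -> number of used arcs leaving node
    for d, i, j in used_arcs:
        outflow[(d, i)] += 1
        inflow[(d, j)] += 1

    violations = []
    for j in trips:
        # (2): every timetabled trip is entered exactly once over all depots.
        if sum(inflow[(d, j)] for d in depots) != 1:
            violations.append(("cover (2)", j))
        # (3): flow conservation per trip and depot.
        for d in depots:
            if inflow[(d, j)] != outflow[(d, j)]:
                violations.append(("flow (3)", d, j))
    for d in depots:
        # (4): pull-outs of depot d must not exceed the available vehicles v_d.
        if outflow[(d, d)] > vehicles[d]:
            violations.append(("capacity (4)", d))
    return violations


trips = [1, 2, 3]
depots = ["A", "B"]
vehicles = {"A": 1, "B": 1}
# One vehicle of depot A serves trips 1 and 2, one vehicle of depot B serves trip 3.
feasible = {("A", "A", 1), ("A", 1, 2), ("A", 2, "A"), ("B", "B", 3), ("B", 3, "B")}
# Depot A would need two vehicles here although only one is available.
too_many = {("A", "A", 1), ("A", 1, "A"), ("A", "A", 2), ("A", 2, "A"),
            ("B", "B", 3), ("B", 3, "B")}
print(check_vehicle_schedule(trips, depots, vehicles, feasible))
print(check_vehicle_schedule(trips, depots, vehicles, too_many))
```

Such a checker does not replace the solver; it merely shows how (2) to (4) read on a given set of selected arcs.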

Regarding solvability of the model by means of common solvers, we proceed as follows. The model is programmed in GAMS (with CPLEX as the underlying solver) and for the first set of experiments, its input data is randomly generated there. The input generation rules are summarized in Table 1. The computing environment uses a standard PC / laptop (Intel(R) Core(TM) i7-6700HQ with 2.60 GHz and 32 GB RAM, 64-bit operating system). All models and data of this paper are available from the authors upon request.

Table 1 Input data generation

The set of crew duties (\(L^h\)), the set of early duties (\(L^h_E\)), the set of late duties (\(L^h_A\)), the set of short duties (\(L^h_T\)), the set of normal duties (\(L^h_N\)), and the set of long duties (\(L^h_O\)) are randomly chosen from the set of all duties.

Instances with different numbers of planning days, duties, depots, trips and drivers are generated as indicated in the next few figures. To generate the number of available vehicles at each depot (\(v_d\)), for small instances with enough depots, \(v_d\) is randomly chosen in the interval [10, 30]. However, if the instance does not contain enough depots to use this method, then the vehicles are equally divided between them. Here, instances with very large numbers of vehicles are built to examine the effect of this factor on the solver ability or the computation time. Figure 2 shows the execution time of solving the problem with CPLEX. It seems evident that the elapsed time increases drastically as the set of crew duties is enlarged. Likewise, Fig. 3 illustrates the execution time versus the number of vehicles. The resulting objective function values with increasing numbers of crew duties and vehicles are depicted in Figs. 4 and 5, respectively.
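The rule for generating \(v_d\) can be sketched in Python as follows. This is a hedged illustration, not the GAMS generator used for the experiments: the function name, the parameter `min_depots` (what counts as "enough depots" is not specified in the text) and the parameter `total_vehicles` for the equal split are hypothetical.

```python
import random


def generate_vehicle_counts(depots, total_vehicles, min_depots=3, seed=0):
    """Return a dict depot -> number of available vehicles v_d.

    With at least min_depots depots (hypothetical threshold), v_d is drawn
    uniformly from the integers in [10, 30]; otherwise total_vehicles
    (hypothetical input) is divided as equally as possible.
    """
    rng = random.Random(seed)
    if len(depots) >= min_depots:
        return {d: rng.randint(10, 30) for d in depots}  # randint is inclusive
    base, extra = divmod(total_vehicles, len(depots))
    # The first `extra` depots receive one additional vehicle.
    return {d: base + (1 if k < extra else 0) for k, d in enumerate(depots)}


print(generate_vehicle_counts(["d1", "d2", "d3", "d4"], total_vehicles=0))
print(generate_vehicle_counts(["d1", "d2"], total_vehicles=501))
```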

Fig. 2 CPU-times for modified numbers of duties

Fig. 3 Execution times (s) for modified numbers of vehicles

Fig. 4 Objective function values for increasing numbers of duties

Fig. 5 Objective function values for modified numbers of vehicles

In the following, the model is once again solved with its input data provided by Huisman (2007) (regarding the vehicle and crew scheduling part). This is in line with the data generation for some of the instances generated in Mesquita et al. (2013) (see footnote 3). The numbers of available vehicles at the depots \(v_d\) and the \(c^d_{ij}\) values are provided there for different numbers of depots (up to 10) as well as different numbers of trips (up to 3000). The remaining input is generated as before. Figures 6 and 7 show the change in the required execution time for the different numbers of depots and trips. The corresponding objective function values are shown in Figs. 8 and 9. Here, a considerable growth in the execution time with larger sets of depots and trips is observable.

Fig. 6 CPU-times for modified numbers of depots

Fig. 7 CPU-times for modified numbers of trips

Fig. 8 Objective function values for modified numbers of depots

Fig. 9 Objective function values for modified numbers of trips

The most important result, however, is a positive answer to our research question R1: we are able to solve the related instances to optimality with a standard solver within time limits deemed practical, rather than resorting to the specialized (heuristic Benders) approach of Mesquita et al. (2013).

3.2 Extending to include buffer times

To address research question R2, we investigate the use of buffer times and delay propagation as presented in Amberg et al. (2019). Their model is flow-based. For each arc \((i,j)\), flow variables \(f_{ij}^{d}\) indicate whether the arc \((i,j)\) is used in the solution by a vehicle from depot d. The cost values \(c_{ij}^{d}\) attached to these variables represent the (variable) costs of a vehicle from depot d serving arc \((i,j)\). If the model is assumed to be circular, fixed costs can be associated with the respective circulation arc; see Fig. 1. Delay propagation in Amberg et al. (2019) is handled either by fixed buffer times or by a calculated measure that represents the possible propagation of delays. Given a duty k with a set of trips to be performed, a measure \(r_k\) is defined that incorporates expected “primary” delays and the resulting secondary delays (duty \(k \in {\mathcal {K}}\) is supposed to perform a set \(T_{k} \subset {\mathcal {T}}'\) of trips / tasks) as follows:

$$\begin{aligned} r_{k}&= \sum _{i = 1}^{\vert T_{k} \vert - 1} p(t_{i},t_{i+1}) \\ p(t_{i},t_{i+1})&= \max \lbrace 0, PAT(t_{i}) + PD(t_{i}) + p(t_{i-1},t_{i}) - PDT(t_{i+1}) \rbrace \quad \quad \forall k \in {\mathcal {K}} \nonumber \end{aligned}$$
(16)

where the index k is omitted whenever possible; \(p(t_{0},t_{1})\) is the initial condition, i.e., the initial delay incurred, say, when arriving at the start of a trip from a pull-out trip, \(PAT(t_{i})\) is the planned finishing time of trip or task \(t_{i}\), \(PD(t_{i})\) is the expected primary delay of \(t_{i}\), and \(PDT(t_{i+1})\) is the planned departure time of the next trip \(t_{i+1}\).
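As a minimal sketch of how the recursion in (16) can be evaluated for a single duty, the following Python function accumulates the propagated delays along the trip sequence. The function name, the list-based data layout and the sample numbers are assumptions made for illustration only; they are not taken from Amberg et al. (2019).

```python
from typing import Sequence


def propagated_delay(pat: Sequence[float], prim_delay: Sequence[float],
                     pdt: Sequence[float], initial_delay: float = 0.0) -> float:
    """Evaluate r_k from (16) for one duty with trips t_1, ..., t_n.

    pat[i], prim_delay[i] and pdt[i] hold PAT, PD and PDT of trip t_{i+1}
    (0-based lists); initial_delay plays the role of p(t_0, t_1).
    """
    if not (len(pat) == len(prim_delay) == len(pdt)):
        raise ValueError("pat, prim_delay and pdt must have the same length")
    r_k = 0.0
    incoming = initial_delay  # p(t_{i-1}, t_i)
    for i in range(len(pat) - 1):
        # p(t_i, t_{i+1}) = max{0, PAT(t_i) + PD(t_i) + p(t_{i-1}, t_i) - PDT(t_{i+1})}
        outgoing = max(0.0, pat[i] + prim_delay[i] + incoming - pdt[i + 1])
        r_k += outgoing
        incoming = outgoing
    return r_k


# Hypothetical duty with three trips (times in minutes after midnight).
pat = [480.0, 530.0, 590.0]        # planned finishing times
prim_delay = [7.0, 4.0, 0.0]       # expected primary delays
pdt = [450.0, 485.0, 540.0]        # planned departure times
r = propagated_delay(pat, prim_delay, pdt)
```

With these hypothetical numbers, the 5-minute gap between the first two trips absorbs all but 2 minutes of the first primary delay, and the 10-minute gap before the third trip absorbs the remainder, so only the first transition contributes to \(r_k\).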

It should be noted that delay propagation of vehicles and/or crew duties is considered to be a measure of robustness (cf. Ge et al. 2020). In the spirit of general key performance indicators one may also ask to which extent robustness measures focusing on other stakeholders like passengers (see the above short discussion) may be incorporated into the model (beyond what is part of delay management).

Next, the integrated model of Mesquita et al. (2013) is enriched with some real-world elements from Amberg et al. (2019). The added elements make it possible to embed buffer times between consecutive vehicle trips as well as delay propagation into the model (see Footnote 4).

The formulation of this model is as follows:

$$\begin{aligned}&\min \sum _{h \in H}\left( \sum _{d \in D}\sum _{(i,j)\in I^h} c^h_{ij} z^{dh}_{ij}+\sum _{l \in L^h} e_l w^h_l\right) + \sum _{m \in M}\sum _{s\in S} r^m x^m_s+\lambda _T \eta _T+ \lambda _O \eta _O \nonumber \\&\quad +\alpha _V \sum _{d\in D}\sum _{k\in K^d_V} ATD^d_k p^d_k+ \alpha _C \sum _{d\in D}\sum _{k\in K^d_C} ATD^d_k q^d_k \end{aligned}$$
(17)
$$\begin{aligned} \sum _{d\in D_i} \sum _{i: (i,j)\in I^h} z^{dh}_{ij}=1 \qquad j\in N^h-D, \ h\in H \end{aligned}$$
(18)
$$\begin{aligned} \sum _{i: (i,j)\in I^h} z^{dh}_{ij}- \sum _{i: (j,i)\in I^h} z^{dh}_{ji}=0 \qquad j\in N^h, d\in D, h\in H \end{aligned}$$
(19)
$$\begin{aligned} \sum _{i \in N^h} z^{dh}_{di} \le v_d \qquad d\in D, h\in H \end{aligned}$$
(20)
$$\begin{aligned} \sum _{l \in L^h_{ij}} w^h_l-\sum _{d\in D}z^{dh}_{ij}\ge 0 \qquad (i,j)\in I^h, \ h\in H \end{aligned}$$
(21)
$$\begin{aligned} \sum _{m \in M} y^{mh}_l-w^h_l=0 \qquad l\in L^h, \ h\in H \end{aligned}$$
(22)
$$\begin{aligned} \sum _{s \in S} x^m_s \le 1 \qquad m\in M \end{aligned}$$
(23)
$$\begin{aligned} \sum _{l \in L^h} y^{mh}_l-\sum _{s\in S} a^h_s x^m_s \le 0 \qquad m\in M, \ h\in H \end{aligned}$$
(24)
$$\begin{aligned} \sum _{l\in L^h_f} y^{mh}_l+ \sum _{l\in L^{h-1}_g} y^{m(h-1)}_l \le 1 \qquad \ m\in M, h\in H-\{1\}, \ f\ne g \in \{E,A\} \end{aligned}$$
(25)
$$\begin{aligned} \sum _{h\in H}\sum _{l\in L^h_t} y^{mh}_l-\eta _t \le 0 \qquad m\in M, \ t\in \{T,O\} \end{aligned}$$
(26)
$$\begin{aligned} \sum _{k \in K^V_{ij}} p^d_k-z^{dh}_{ij}=0 \qquad \forall d\in D, h\in H, \forall (i,j)\in I^h \end{aligned}$$
(27)
$$\begin{aligned} \sum _{k \in K^C_{ij}} q^d_k-z^{dh}_{ij}=0 \qquad \forall d\in D, h\in H, \forall (i,j)\in I^h \end{aligned}$$
(28)
$$\begin{aligned} DP_k=ED_k+\sum _{k'=1}^{k-1} p^d_{k'} \sigma ^d_{k'} \qquad d\in D, k \in K^C_{ij} \cup K^V_{ij} \end{aligned}$$
(29)
$$\begin{aligned} ATD^d_k=\max (0, DP_k-BT_k) \qquad d\in D, k \in K^C_{ij} \cup K^V_{ij} \end{aligned}$$
(30)
$$\begin{aligned} z^{dh}_{ij} \in \{0,1\} \qquad \ (i,j)\in I^h, \ d\in D, \ h\in H \end{aligned}$$
(31)
$$\begin{aligned} w^h_l \in \{0,1\} \qquad l\in L^h, \ m\in M, \ h\in H \end{aligned}$$
(32)
$$\begin{aligned} w^h_l\in \{0,1\} \qquad \ l\in L^h, \ h\in H \end{aligned}$$
(33)
$$\begin{aligned} y^{mh}_l \in \{0,1\} \qquad \ l\in L^h, \ m\in M,\ h\in H \end{aligned}$$
(34)
$$\begin{aligned} x^m_s \in \{0,1\} \qquad \ s\in S, \ m\in M \end{aligned}$$
(35)
$$\begin{aligned} \eta _T, \eta _O \ge 0 {\text { and integer}} \end{aligned}$$
(36)
$$\begin{aligned} p^d_k\in \{0,1\} \qquad \forall d\in D, \; \forall k\in K^d_V \end{aligned}$$
(37)
$$\begin{aligned} q^d_k\in \{0,1\} \qquad \forall d\in D, \; \forall k\in K^d_C \end{aligned}$$
(38)

The new objective function (17) measures the quality of the solution in terms of vehicle and driver costs, roster balancing, total buffer times, and delay propagation costs regarding the vehicle tasks and crew duties. Then constraints (2)–(10) are repeated as constraints (18)–(26) followed by the inclusion of the buffer times as well as the definition of the variable ranges.

Constraints (27) and (28) enforce that \(p^d_k\) and \(q^d_k\), respectively, equal 1 for exactly one k if \(z^{dh}_{ij}=1\). Constraint (29) sets the delay propagated up to duty k to its expected delay plus the sum of the actual delays of all previous duties. Constraint (30) then calculates the actual delay of duty k as the maximum of zero and the propagated delay minus the buffer time. The remaining constraints define the variable ranges.
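The interplay of (29) and (30) can be illustrated numerically for a fixed assignment. The sketch below assumes that the duties of one depot are indexed 1, ..., n in order, that \(ED_k\), \(\sigma_k\) and \(BT_k\) are given as plain lists, and that p is a fixed 0/1 selection vector. These data structures, the function name and the sample values are hypothetical and only meant to show how the two constraints interact; inside the model, p is of course a decision variable.

```python
from typing import List, Sequence, Tuple


def delay_terms(ed: Sequence[float], sigma: Sequence[float],
                bt: Sequence[float], p: Sequence[int]
                ) -> Tuple[List[float], List[float]]:
    """Return (DP, ATD) for duties k = 1, ..., n of one depot.

    ed[k], sigma[k], bt[k] and p[k] hold ED, sigma, BT and the 0/1
    selection value of duty k+1 (0-based lists).
    """
    if not (len(ed) == len(sigma) == len(bt) == len(p)):
        raise ValueError("all inputs must have the same length")
    dp: List[float] = []
    atd: List[float] = []
    carried = 0.0  # running sum of p_{k'} * sigma_{k'} over k' < k
    for k in range(len(ed)):
        dp_k = ed[k] + carried            # constraint (29)
        atd_k = max(0.0, dp_k - bt[k])    # constraint (30)
        dp.append(dp_k)
        atd.append(atd_k)
        carried += p[k] * sigma[k]
    return dp, atd


# Hypothetical data for three duties (minutes).
dp, atd = delay_terms(ed=[3.0, 1.0, 4.0], sigma=[2.0, 5.0, 1.0],
                      bt=[2.0, 4.0, 3.0], p=[1, 0, 1])
```

In this example the second duty is not selected (p = 0) and therefore does not pass its delay on, so the third duty inherits only the 2 minutes stemming from the first duty.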

To obtain numerical results, instances are generated as in the previous subsection, with the extensions required by the problem modification. That is, the input data of the model are either randomly generated as shown in Table 2 or taken from Huisman (2007), with the remaining data randomly generated. Note that we assume that larger buffer times correspond to busier sections of the timetable.

Table 2 The generation rules of the input data

Again, the model is programmed in GAMS with the CPLEX solver in standard mode on a computer as indicated above. For each instance size, ten instances are generated, five of them with totally random data and five based on the data from Huisman (2007). Whenever results are shown, the average of this sample is referred to.

Figures 10, 11, 12, and 13 depict the required execution times of the model with an increasing number of duties, vehicles, trips, and depots, respectively, once in its pure form and once including fixed buffer times between the trips as well as delay propagation. In each case, the other parameters are set at their middle value. We set a time limit of 2 h (7200 s) for the solver; whenever this limit is reached, the solution process stops. This time limit is shown in red in the figures.

As is evident, adding the complementary elements to the model does not increase the execution time considerably. The model of Mesquita et al. (2013) can thus be enriched without major concerns about solvability. Based on these experiments and the depicted results, we can positively answer the main question of this research (research question R2), which concerns the solvability of this model by a standard exact solver. Although the computation time grows considerably as any of the parameters increases, the parameters can be at or near values known from practical settings. An interesting exception is the trend observed in Fig. 13, which shows a substantial difference between the two models. This is because the number of depots contributes to many constraint blocks. Increasing this parameter therefore raises the total number of constraints in the model, which considerably slows down exact optimization once the additional robustness elements (buffer times and delay propagation) are part of the model.

Fig. 10 The required execution time by increasing the number of duties

Fig. 11 The required execution time by increasing the number of vehicles

Fig. 12 Required execution times by increasing the number of trips

Fig. 13 Required execution times by increasing the number of depots

At the end of this subsection, we report on experiments with increasing buffer times. The results in terms of required execution times and objective values are shown in Figs. 14 and 15, respectively. As can be observed, enforcing longer buffer times worsens the objective function values and also makes the problem harder, so that more time is required to solve it.

Fig. 14 Execution times by increasing the buffer times (BT)

Fig. 15 Objective values by increasing the buffer times (BT)

3.3 Adding robustness to the model

Bearing in mind that the problem inputs frequently vary in the real world and are non-deterministic in nature, the concept of robustness is added to the model in this subsection. This is done by considering three different possible values or scenarios for all the coefficients in the objective function (1), which we call here \(Z_{O}\). To this end, each generated value of \(c^d_{ij}\), \(e_l\), \(r^m\), \(\lambda _T\) and \(\lambda _O\) can be replaced by the set \(\{0.9 \cdot fi, fi, 1.1 \cdot fi\}\), where fi is the fixed input value. Then, numerous scenarios are generated by combining these possible coefficient values, and the objective function value is calculated for each of them. The objective value of the robust model is the maximum, i.e., the worst, among the objective values of all scenarios, because this guarantees that the objective cannot be worse than that value in any scenario. This is in accordance with the definition of robustness given in Ben-Tal et al. (2009). So, a robust version of the first model presented in Sect. 3.1 can be built by replacing the objective (1), i.e., \(Z_{O}\), with \(\max _{All\ scenarios} Z_{O}\). The same instances are solved for the robust models, and the required execution times as well as the objective function values for modified numbers of vehicles are shown and compared with those of the base model in Figs. 16 and 17, respectively.
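To make the scenario construction concrete, the following sketch enumerates the scenarios for a fixed solution and returns the worst objective value. It assumes, for illustration only, that each of the five coefficient groups is scaled by one common factor from {0.9, 1, 1.1}, which yields \(3^5 = 243\) scenarios, and that the solution is summarized by one nominal cost contribution per group. The text does not specify whether the scaling is applied per group or per individual coefficient, and the function name, the group labels and the sample values are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

FACTORS = (0.9, 1.0, 1.1)  # the three possible multiples of the fixed input fi


def worst_case_objective(nominal: Dict[str, float]
                         ) -> Tuple[float, Dict[str, float]]:
    """Enumerate all factor combinations and return the largest objective.

    nominal maps each coefficient group to its cost contribution at the
    fixed input values (factor 1.0) for one given solution.
    """
    groups = sorted(nominal)
    worst_value = float("-inf")
    worst_scaling: Dict[str, float] = {}
    for scaling in product(FACTORS, repeat=len(groups)):
        value = sum(f * nominal[g] for f, g in zip(scaling, groups))
        if value > worst_value:
            worst_value = value
            worst_scaling = dict(zip(groups, scaling))
    return worst_value, worst_scaling


# Hypothetical cost contributions of one solution, by coefficient group.
nominal = {"c": 1200.0, "e": 800.0, "r": 300.0,
           "lambda_T": 40.0, "lambda_O": 60.0}
worst_value, worst_scaling = worst_case_objective(nominal)
```

In this simplified setting with nonnegative contributions, the maximum is attained when every factor equals 1.1. When the maximum is to be minimized inside a MIP, a standard device is an auxiliary variable \(\theta\) that is minimized subject to \(\theta \ge Z_{O}^{s}\) for every scenario s, where \(Z_{O}^{s}\) denotes the objective under scenario s; whether the experiments reported here were implemented this way is not stated.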

Fig. 16 Execution times (s) to achieve robust results vs. execution times of the base model for modified numbers of vehicles

Fig. 17 Robust objective values vs. objective values of the base model for modified numbers of vehicles

The robust results show that the execution time, and hence the computational effort, does not increase considerably in comparison to the non-robust model. However, the objective value deteriorates moderately in the robust model. This is expected because the worst value among the many scenarios is chosen (based on the different realizations of the variation with respect to the fi-values).

4 Conclusions

In this research, the scalability of an extended integrated vehicle and crew scheduling (rostering) model from the literature is examined. Extended problem settings incorporating buffer times and delay propagation do not harm the solvability of the model with standard solvers. The main achievement is the verification that new elements can be incorporated into a model that is solved by general purpose solvers, without the need for specially tailored algorithms. The results have been obtained for instances of the same size as those tackled with the model from the literature, and beyond. A quote that we can borrow from Wolsey (2002), from the field of production planning and lot sizing quite some time ago, may even start holding in the realm of public transport: “there is a nontrivial fraction of practical [...] problems that can now be solved by nonspecialists just by taking an appropriate a priori reformulation of the problem, and then feeding the resulting formulation into a commercial mixed-integer programming solver”.

As a direction for future research, we envisage the incorporation of stochastic elements with uncertainty sets regarding demand (beyond the first attempts at incorporating robustness shown in this paper), the incorporation of load-dependent traffic situations, as well as the consideration of an uncertain availability of vehicles. Besides, it would also be worthwhile to explore other “richness” features, such as rostering constraints involving a maximum number of days off within a month, to discover their effect on the computation time, although they are expected to increase the complexity of the problem non-trivially.

Another option for future work refers to setting up a general framework for replication studies. As this has not yet been extensively studied, we just propose to set it up along the lines provided in the literature review.